Dataset statistics
| Number of variables | 41 |
|---|---|
| Number of observations | 47520 |
| Missing cells | 37330 |
| Missing cells (%) | 1.9% |
| Total size in memory | 14.9 MiB |
| Average record size in memory | 328.0 B |
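The missing-cell percentage follows from the other figures in this table: the total number of cells is the number of variables times the number of observations. A minimal check, using only the values reported above:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 41
n_observations = 47520
missing_cells = 37330

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The result rounds to the 1.9% shown in the table.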
Variable types
| Numeric | 10 |
|---|---|
| Text | 29 |
| Boolean | 2 |
Alerts
| Alert | Type |
|---|---|
| recorded_by has constant value "" | Constant |
| public_meeting is highly imbalanced (56.0%) | Imbalance |
| funder has 2877 (6.1%) missing values | Missing |
| installer has 2889 (6.1%) missing values | Missing |
| public_meeting has 2689 (5.7%) missing values | Missing |
| scheme_management has 3103 (6.5%) missing values | Missing |
| scheme_name has 23036 (48.5%) missing values | Missing |
| permit has 2439 (5.1%) missing values | Missing |
| amount_tsh is highly skewed (γ1 = 57.2301714) | Skewed |
| num_private is highly skewed (γ1 = 89.07841041) | Skewed |
| id has unique values | Unique |
| amount_tsh has 33331 (70.1%) zeros | Zeros |
| gps_height has 16275 (34.2%) zeros | Zeros |
| longitude has 1433 (3.0%) zeros | Zeros |
| num_private has 46903 (98.7%) zeros | Zeros |
| population has 17048 (35.9%) zeros | Zeros |
| construction_year has 16503 (34.7%) zeros | Zeros |
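The percentages in the Zeros alerts are each column's zero count divided by the 47520 observations. A short sketch that recomputes them from the counts listed above (the dictionary name is illustrative):

```python
n_observations = 47520  # from "Dataset statistics"

# Zero counts copied from the Zeros alerts.
zero_counts = {
    "amount_tsh": 33331,
    "gps_height": 16275,
    "longitude": 1433,
    "num_private": 46903,
    "population": 17048,
    "construction_year": 16503,
}

# Share of zeros per column, rounded to one decimal as in the report.
zero_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```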
Reproduction
| Analysis started | 2024-02-09 10:28:59.066490 |
|---|---|
| Analysis finished | 2024-02-09 10:29:03.471850 |
| Duration | 4.41 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
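The reported duration is the difference between the two timestamps above. A quick check with the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-02-09 10:28:59.066490")
finished = datetime.fromisoformat("2024-02-09 10:29:03.471850")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```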
id
Real number (ℝ)
UNIQUE 
| Distinct | 47520 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37114.48641 |
| Minimum | 0 |
| Maximum | 74247 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3733.9 |
| Q1 | 18555.75 |
| median | 37038 |
| Q3 | 55666.25 |
| 95-th percentile | 70566.05 |
| Maximum | 74247 |
| Range | 74247 |
| Interquartile range (IQR) | 37110.5 |
Descriptive statistics
| Standard deviation | 21445.76541 |
|---|---|
| Coefficient of variation (CV) | 0.5778273521 |
| Kurtosis | -1.199055428 |
| Mean | 37114.48641 |
| Median Absolute Deviation (MAD) | 18558.5 |
| Skewness | 0.002774336093 |
| Sum | 1763680394 |
| Variance | 459920853.8 |
| Monotonicity | Not monotonic |
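Several of the statistics above are derived from others: the coefficient of variation is the standard deviation divided by the mean, the IQR is Q3 minus Q1, and the variance is the squared standard deviation. A small sketch that recomputes them from the printed values for `id`; agreement is only expected up to the rounding of those printed inputs:

```python
# Values copied from the tables for `id`.
std = 21445.76541
mean = 37114.48641
q1, q3 = 18555.75, 55666.25
minimum, maximum = 0, 74247

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance from the standard deviation

print(f"CV = {cv:.6f}")
print(f"IQR = {iqr}")
print(f"Range = {value_range}")
print(f"Variance = {variance:.1f}")
```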
| Value | Count | Frequency (%) |
| 454 | 1 | < 0.1% |
| 49218 | 1 | < 0.1% |
| 10673 | 1 | < 0.1% |
| 20940 | 1 | < 0.1% |
| 67861 | 1 | < 0.1% |
| 68334 | 1 | < 0.1% |
| 533 | 1 | < 0.1% |
| 30019 | 1 | < 0.1% |
| 66595 | 1 | < 0.1% |
| 17276 | 1 | < 0.1% |
| Other values (47510) | 47510 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 2 | 1 | |
| 4 | 1 | |
| 6 | 1 |
| Value | Count | Frequency (%) |
| 74247 | 1 | |
| 74243 | 1 | |
| 74242 | 1 | |
| 74240 | 1 | |
| 74239 | 1 |
amount_tsh
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 96 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 322.0475726 |
| Minimum | 0 |
| Maximum | 350000 |
| Zeros | 33331 |
| Zeros (%) | 70.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 20 |
| 95-th percentile | 1200 |
| Maximum | 350000 |
| Range | 350000 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 3200.623244 |
|---|---|
| Coefficient of variation (CV) | 9.938355435 |
| Kurtosis | 4638.375637 |
| Mean | 322.0475726 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 57.2301714 |
| Sum | 15303700.65 |
| Variance | 10243989.15 |
| Monotonicity | Not monotonic |
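A column that is mostly zeros with a few very large values, as the zero share and maximum above indicate for `amount_tsh`, produces a large positive skewness. The sketch below uses the plain moment-based definition g1 = m3 / m2^1.5; the report may use a bias-corrected estimator, so this is illustrative rather than a reproduction of the figure above, and the toy data are hypothetical:

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy column: mostly zeros plus one very large value,
# loosely imitating the shape reported for amount_tsh.
toy = [0] * 70 + [20] * 20 + [500] * 9 + [350000]
symmetric = [1, 2, 3, 4, 5]

print(skewness(symmetric))  # symmetric data has no skew
print(skewness(toy))        # a single extreme value dominates
```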
| Value | Count | Frequency (%) |
| 0 | 33331 | 70.1% |
| 500 | 2488 | 5.2% |
| 50 | 1986 | 4.2% |
| 20 | 1185 | 2.5% |
| 1000 | 1167 | 2.5% |
| 200 | 980 | 2.1% |
| 100 | 653 | 1.4% |
| 10 | 649 | 1.4% |
| 30 | 607 | 1.3% |
| 2000 | 559 | 1.2% |
| Other values (86) | 3915 | 8.2% |
| Value | Count | Frequency (%) |
| 0 | 33331 | 70.1% |
| 0.2 | 2 | < 0.1% |
| 0.25 | 1 | < 0.1% |
| 1 | 2 | < 0.1% |
| 2 | 11 | < 0.1% |
| Value | Count | Frequency (%) |
| 350000 | 1 | |
| 250000 | 1 | |
| 200000 | 1 | |
| 170000 | 1 | |
| 120000 | 1 |
date_recorded
Text
| Distinct | 351 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 475200 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 31 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 2013-02-27 |
|---|---|
| 2nd row | 2011-03-17 |
| 3rd row | 2011-07-10 |
| 4th row | 2011-04-12 |
| 5th row | 2011-04-05 |
| Value | Count | Frequency (%) |
| 2011-03-15 | 459 | 1.0% |
| 2011-03-17 | 458 | 1.0% |
| 2013-02-03 | 442 | 0.9% |
| 2011-03-14 | 438 | 0.9% |
| 2011-03-16 | 394 | 0.8% |
| 2011-03-18 | 381 | 0.8% |
| 2011-03-04 | 379 | 0.8% |
| 2011-03-19 | 371 | 0.8% |
| 2013-02-14 | 368 | 0.8% |
| 2013-01-29 | 365 | 0.8% |
| Other values (341) | 43465 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 111199 | |
| 1 | 103202 | |
| - | 95040 | |
| 2 | 83096 | |
| 3 | 42279 | 8.9% |
| 7 | 10258 | 2.2% |
| 4 | 8602 | 1.8% |
| 8 | 7477 | 1.6% |
| 6 | 4895 | 1.0% |
| 5 | 4861 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 380160 | |
| Dash Punctuation | 95040 | 20.0% |
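The category counts follow from the fixed 10-character date format: each value has 8 digits and 2 dashes, so 47520 values give 380160 digits and 95040 dashes. A sketch using Python's `unicodedata` on the five sample rows, where `Nd` is the Unicode code for Decimal Number and `Pd` for Dash Punctuation:

```python
import unicodedata
from collections import Counter

# The five sample values listed for date_recorded.
samples = ["2013-02-27", "2011-03-17", "2011-07-10", "2011-04-12", "2011-04-05"]

# Count the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)

# Scale the per-value pattern (8 digits, 2 dashes) to the full column.
n_observations = 47520
total_digits = n_observations * 8
total_dashes = n_observations * 2
dash_pct = 100 * total_dashes / (total_digits + total_dashes)
print(total_digits, total_dashes, f"{dash_pct:.1f}%")
```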
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 111199 | |
| 1 | 103202 | |
| 2 | 83096 | |
| 3 | 42279 | 11.1% |
| 7 | 10258 | 2.7% |
| 4 | 8602 | 2.3% |
| 8 | 7477 | 2.0% |
| 6 | 4895 | 1.3% |
| 5 | 4861 | 1.3% |
| 9 | 4291 | 1.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 95040 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 475200 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 111199 | |
| 1 | 103202 | |
| - | 95040 | |
| 2 | 83096 | |
| 3 | 42279 | 8.9% |
| 7 | 10258 | 2.2% |
| 4 | 8602 | 1.8% |
| 8 | 7477 | 1.6% |
| 6 | 4895 | 1.0% |
| 5 | 4861 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 475200 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 111199 | |
| 1 | 103202 | |
| - | 95040 | |
| 2 | 83096 | |
| 3 | 42279 | 8.9% |
| 7 | 10258 | 2.2% |
| 4 | 8602 | 1.8% |
| 8 | 7477 | 1.6% |
| 6 | 4895 | 1.0% |
| 5 | 4861 | 1.0% |
funder
Text
MISSING 
| Distinct | 1697 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 2877 |
| Missing (%) | 6.1% |
| Memory size | 371.4 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 9.919629057 |
| Min length | 1 |
Characters and Unicode
| Total characters | 442842 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 865 |
|---|---|
| Unique (%) | 1.9% |
Sample
| 1st row | Dmdd |
|---|---|
| 2nd row | Cmsr |
| 3rd row | Kkkt |
| 4th row | Ki |
| 5th row | Hesawa |
| Value | Count | Frequency (%) |
| of | 7794 | 10.8% |
| government | 7406 | 10.2% |
| tanzania | 7320 | 10.1% |
| danida | 2496 | 3.5% |
| world | 2232 | 3.1% |
| water | 2140 | 3.0% |
| hesawa | 1795 | 2.5% |
| bank | 1133 | 1.6% |
| rwssp | 1107 | 1.5% |
| kkkt | 1107 | 1.5% |
| Other values (1861) | 37753 |
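The table above appears to count lowercased, whitespace-separated words rather than whole cell values: entries such as `of`, `government`, and `tanzania` are lowercase, while the sample rows are capitalized. A rough sketch of that kind of count, assuming simple lowercasing and whitespace splitting; the multi-word inputs and the missing entry are hypothetical:

```python
from collections import Counter

# First five values are the sample rows; the rest are hypothetical inputs.
values = [
    "Dmdd", "Cmsr", "Kkkt", "Ki", "Hesawa",
    "Government Of Tanzania", "World Bank", None,
]

tokens = [
    token
    for value in values
    if value is not None                 # skip missing entries
    for token in value.lower().split()   # lowercase, split on whitespace
]
word_counts = Counter(tokens)

print(word_counts.most_common(3))
```

Under this assumption the percentages are shares of all words, not of rows, which would explain why a single multi-word value contributes to several rows of the table.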
Most occurring characters
| Value | Count | Frequency (%) |
| a | 54523 | 12.3% |
| n | 46193 | 10.4% |
| i | 30358 | 6.9% |
| e | 30023 | 6.8% |
| (space) | 27687 | 6.3% |
| r | 22343 | 5.0% |
| t | 18436 | 4.2% |
| o | 18219 | 4.1% |
| s | 13776 | 3.1% |
| d | 12416 | 2.8% |
| Other values (59) | 168868 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 340602 | |
| Uppercase Letter | 71734 | 16.2% |
| Space Separator | 27687 | 6.3% |
| Other Punctuation | 1075 | 0.2% |
| Decimal Number | 651 | 0.1% |
| Open Punctuation | 355 | 0.1% |
| Close Punctuation | 350 | 0.1% |
| Dash Punctuation | 265 | 0.1% |
| Connector Punctuation | 123 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 54523 | |
| n | 46193 | |
| i | 30358 | 8.9% |
| e | 30023 | 8.8% |
| r | 22343 | 6.6% |
| t | 18436 | 5.4% |
| o | 18219 | 5.3% |
| s | 13776 | 4.0% |
| d | 12416 | 3.6% |
| f | 12282 | 3.6% |
| Other values (16) | 82033 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 9696 | |
| G | 8558 | |
| O | 8483 | |
| D | 6329 | 8.8% |
| W | 5908 | 8.2% |
| C | 3735 | 5.2% |
| R | 3541 | 4.9% |
| H | 2802 | 3.9% |
| M | 2488 | 3.5% |
| A | 2363 | 3.3% |
| Other values (16) | 17831 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 643 | |
| 2 | 3 | 0.5% |
| 9 | 2 | 0.3% |
| 1 | 2 | 0.3% |
| 4 | 1 | 0.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 639 | |
| . | 376 | |
| \ | 30 | 2.8% |
| & | 22 | 2.0% |
| ' | 8 | 0.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 352 | |
| [ | 3 | 0.8% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 348 | |
| ] | 2 | 0.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 27687 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 265 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 123 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 412336 | |
| Common | 30506 | 6.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 54523 | 13.2% |
| n | 46193 | 11.2% |
| i | 30358 | 7.4% |
| e | 30023 | 7.3% |
| r | 22343 | 5.4% |
| t | 18436 | 4.5% |
| o | 18219 | 4.4% |
| s | 13776 | 3.3% |
| d | 12416 | 3.0% |
| f | 12282 | 3.0% |
| Other values (42) | 153767 |
Common
| Value | Count | Frequency (%) |
| (space) | 27687 | |
| 0 | 643 | 2.1% |
| / | 639 | 2.1% |
| . | 376 | 1.2% |
| ( | 352 | 1.2% |
| ) | 348 | 1.1% |
| - | 265 | 0.9% |
| _ | 123 | 0.4% |
| \ | 30 | 0.1% |
| & | 22 | 0.1% |
| Other values (7) | 21 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 442842 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 54523 | 12.3% |
| n | 46193 | 10.4% |
| i | 30358 | 6.9% |
| e | 30023 | 6.8% |
| (space) | 27687 | 6.3% |
| r | 22343 | 5.0% |
| t | 18436 | 4.2% |
| o | 18219 | 4.1% |
| s | 13776 | 3.1% |
| d | 12416 | 2.8% |
| Other values (59) | 168868 |
gps_height
Real number (ℝ)
ZEROS 
| Distinct | 2401 |
|---|---|
| Distinct (%) | 5.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 668.7453704 |
| Minimum | -63 |
| Maximum | 2770 |
| Zeros | 16275 |
| Zeros (%) | 34.2% |
| Negative | 1203 |
| Negative (%) | 2.5% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | -63 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 370 |
| Q3 | 1320 |
| 95-th percentile | 1797 |
| Maximum | 2770 |
| Range | 2833 |
| Interquartile range (IQR) | 1320 |
Descriptive statistics
| Standard deviation | 692.9721534 |
|---|---|
| Coefficient of variation (CV) | 1.036227216 |
| Kurtosis | -1.291294408 |
| Mean | 668.7453704 |
| Median Absolute Deviation (MAD) | 370 |
| Skewness | 0.4621979983 |
| Sum | 31778780 |
| Variance | 480210.4054 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 16275 | 34.2% |
| -15 | 52 | 0.1% |
| -16 | 49 | 0.1% |
| -13 | 45 | 0.1% |
| -20 | 43 | 0.1% |
| 1290 | 42 | 0.1% |
| -14 | 41 | 0.1% |
| -27 | 39 | 0.1% |
| 1269 | 39 | 0.1% |
| 1304 | 39 | 0.1% |
| Other values (2391) | 30856 |
| Value | Count | Frequency (%) |
| -63 | 2 | |
| -59 | 1 | |
| -57 | 1 | |
| -55 | 1 | |
| -54 | 1 |
| Value | Count | Frequency (%) |
| 2770 | 1 | |
| 2628 | 1 | |
| 2627 | 1 | |
| 2626 | 2 | |
| 2614 | 1 |
installer
Text
MISSING 
| Distinct | 1923 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 2889 |
| Missing (%) | 6.1% |
| Memory size | 371.4 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 29 |
| Mean length | 6.103605118 |
| Min length | 1 |
Characters and Unicode
| Total characters | 272410 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 974 |
|---|---|
| Unique (%) | 2.2% |
Sample
| 1st row | DMDD |
|---|---|
| 2nd row | Gove |
| 3rd row | KKKT |
| 4th row | Ki |
| 5th row | DWE |
| Value | Count | Frequency (%) |
| dwe | 14097 | |
| government | 2177 | 4.0% |
| water | 1495 | 2.7% |
| hesawa | 1153 | 2.1% |
| rwe | 991 | 1.8% |
| district | 964 | 1.8% |
| kkkt | 922 | 1.7% |
| council | 882 | 1.6% |
| commu | 854 | 1.6% |
| danida | 836 | 1.5% |
| Other values (1791) | 30248 |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 22054 | 8.1% |
| W | 20701 | 7.6% |
| E | 20370 | 7.5% |
| a | 13920 | 5.1% |
| n | 13158 | 4.8% |
| e | 12367 | 4.5% |
| i | 11985 | 4.4% |
| A | 10938 | 4.0% |
| r | 10676 | 3.9% |
| t | 10272 | 3.8% |
| Other values (59) | 125969 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 134003 | |
| Lowercase Letter | 126376 | |
| Space Separator | 10099 | 3.7% |
| Other Punctuation | 798 | 0.3% |
| Decimal Number | 639 | 0.2% |
| Dash Punctuation | 222 | 0.1% |
| Open Punctuation | 131 | < 0.1% |
| Connector Punctuation | 125 | < 0.1% |
| Close Punctuation | 15 | < 0.1% |
| Currency Symbol | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 22054 | |
| W | 20701 | |
| E | 20370 | |
| A | 10938 | |
| C | 8449 | 6.3% |
| S | 5354 | 4.0% |
| R | 5189 | 3.9% |
| I | 4975 | 3.7% |
| T | 4779 | 3.6% |
| K | 4275 | 3.2% |
| Other values (16) | 26919 |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 13920 | |
| n | 13158 | |
| e | 12367 | |
| i | 11985 | |
| r | 10676 | 8.4% |
| t | 10272 | 8.1% |
| o | 9915 | 7.8% |
| m | 7447 | 5.9% |
| s | 4958 | 3.9% |
| l | 4952 | 3.9% |
| Other values (16) | 26726 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 556 | |
| . | 187 | 23.4% |
| & | 44 | 5.5% |
| ' | 10 | 1.3% |
| # | 1 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 636 | |
| 9 | 1 | 0.2% |
| 4 | 1 | 0.2% |
| 1 | 1 | 0.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 129 | |
| [ | 2 | 1.5% |
Close Punctuation
| Value | Count | Frequency (%) |
| } | 13 | |
| ] | 2 | 13.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10099 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 222 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 125 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 260379 | |
| Common | 12031 | 4.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| D | 22054 | 8.5% |
| W | 20701 | 8.0% |
| E | 20370 | 7.8% |
| a | 13920 | 5.3% |
| n | 13158 | 5.1% |
| e | 12367 | 4.7% |
| i | 11985 | 4.6% |
| A | 10938 | 4.2% |
| r | 10676 | 4.1% |
| t | 10272 | 3.9% |
| Other values (42) | 113938 |
Common
| Value | Count | Frequency (%) |
| (space) | 10099 | |
| 0 | 636 | 5.3% |
| / | 556 | 4.6% |
| - | 222 | 1.8% |
| . | 187 | 1.6% |
| ( | 129 | 1.1% |
| _ | 125 | 1.0% |
| & | 44 | 0.4% |
| } | 13 | 0.1% |
| ' | 10 | 0.1% |
| Other values (7) | 10 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 272410 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| D | 22054 | 8.1% |
| W | 20701 | 7.6% |
| E | 20370 | 7.5% |
| a | 13920 | 5.1% |
| n | 13158 | 4.8% |
| e | 12367 | 4.5% |
| i | 11985 | 4.4% |
| A | 10938 | 4.0% |
| r | 10676 | 3.9% |
| t | 10272 | 3.8% |
| Other values (59) | 125969 |
longitude
Real number (ℝ)
ZEROS 
| Distinct | 46043 |
|---|---|
| Distinct (%) | 96.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.09131645 |
| Minimum | 0 |
| Maximum | 40.34519307 |
| Zeros | 1433 |
| Zeros (%) | 3.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 30.04355494 |
| Q1 | 33.08431976 |
| median | 34.91167698 |
| Q3 | 37.18058514 |
| 95-th percentile | 39.13658192 |
| Maximum | 40.34519307 |
| Range | 40.34519307 |
| Interquartile range (IQR) | 4.096265385 |
Descriptive statistics
| Standard deviation | 6.538402533 |
|---|---|
| Coefficient of variation (CV) | 0.1917908492 |
| Kurtosis | 19.36254803 |
| Mean | 34.09131645 |
| Median Absolute Deviation (MAD) | 2.03768666 |
| Skewness | -4.203791707 |
| Sum | 1620019.358 |
| Variance | 42.75070768 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1433 | 3.0% |
| 39.09568416 | 2 | < 0.1% |
| 39.09649867 | 2 | < 0.1% |
| 39.09348389 | 2 | < 0.1% |
| 39.10124424 | 2 | < 0.1% |
| 37.33981057 | 2 | < 0.1% |
| 37.53277831 | 2 | < 0.1% |
| 32.96700926 | 2 | < 0.1% |
| 39.09143391 | 2 | < 0.1% |
| 39.09906887 | 2 | < 0.1% |
| Other values (46033) | 46069 |
| Value | Count | Frequency (%) |
| 0 | 1433 | 3.0% |
| 29.6071219 | 1 | < 0.1% |
| 29.61032056 | 1 | < 0.1% |
| 29.61096482 | 1 | < 0.1% |
| 29.61194674 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 40.34519307 | 1 | |
| 40.34430089 | 1 | |
| 40.32523996 | 1 | |
| 40.32522643 | 1 | |
| 40.32340181 | 1 |
latitude
Real number (ℝ)
| Distinct | 46044 |
|---|---|
| Distinct (%) | 96.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.705002278 |
| Minimum | -11.64944018 |
| Maximum | -2 × 10⁻⁸ |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 47520 |
| Negative (%) | 100.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | -11.64944018 |
|---|---|
| 5-th percentile | -10.58403249 |
| Q1 | -8.532465267 |
| median | -5.017697195 |
| Q3 | -3.326464222 |
| 95-th percentile | -1.417337671 |
| Maximum | -2 × 10⁻⁸ |
| Range | 11.64944016 |
| Interquartile range (IQR) | 5.206001045 |
Descriptive statistics
| Standard deviation | 2.943502774 |
|---|---|
| Coefficient of variation (CV) | -0.5159512004 |
| Kurtosis | -1.057146654 |
| Mean | -5.705002278 |
| Median Absolute Deviation (MAD) | 2.07045949 |
| Skewness | -0.1540169554 |
| Sum | -271101.7082 |
| Variance | 8.664208578 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -2 × 10⁻⁸ | 1433 | 3.0% |
| -6.99261144 | 2 | < 0.1% |
| -2.47667983 | 2 | < 0.1% |
| -6.9802163 | 2 | < 0.1% |
| -6.98945622 | 2 | < 0.1% |
| -6.9813255 | 2 | < 0.1% |
| -7.06537264 | 2 | < 0.1% |
| -6.96247516 | 2 | < 0.1% |
| -2.50658954 | 2 | < 0.1% |
| -6.97826294 | 2 | < 0.1% |
| Other values (46034) | 46069 |
| Value | Count | Frequency (%) |
| -11.64944018 | 1 | |
| -11.64837759 | 1 | |
| -11.58629656 | 1 | |
| -11.56857679 | 1 | |
| -11.56680457 | 1 |
| Value | Count | Frequency (%) |
| -2 × 10⁻⁸ | 1433 | 3.0% |
| -0.99846435 | 1 | < 0.1% |
| -0.998916 | 1 | < 0.1% |
| -0.99901209 | 1 | < 0.1% |
| -0.9994692 | 1 | < 0.1% |
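Longitude reports 1433 exact zeros and latitude reports 1433 values of -2 × 10⁻⁸, while the smallest non-zero longitude is about 29.6. That pattern suggests placeholder coordinates, although the report does not confirm that the two sets fall on the same rows. A minimal sketch of masking such pairs, where the function name and the tolerance are illustrative assumptions:

```python
def mask_placeholder(longitude, latitude, tolerance=1e-6):
    """Return (None, None) when both coordinates are effectively zero."""
    if abs(longitude) < tolerance and abs(latitude) < tolerance:
        return None, None
    return longitude, latitude


rows = [
    (0.0, -2e-08),               # placeholder-like pair
    (39.09568416, -6.99261144),  # values from the frequency tables; pairing is illustrative
]

cleaned = [mask_placeholder(lon, lat) for lon, lat in rows]
print(cleaned)
```

Masking both coordinates together avoids keeping a valid-looking latitude next to a missing longitude.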
wpt_name
Text
| Distinct | 30741 |
|---|---|
| Distinct (%) | 64.7% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 371.4 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 25 |
| Mean length | 10.9545445 |
| Min length | 1 |
Characters and Unicode
| Total characters | 520549 |
|---|---|
| Distinct characters | 74 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 27331 |
|---|---|
| Unique (%) | 57.5% |
Sample
| 1st row | Narmo |
|---|---|
| 2nd row | Lukali |
| 3rd row | Mahakama |
| 4th row | Shule Ya Msingi Chosi A |
| 5th row | Kwa Mjowe |
| Value | Count | Frequency (%) |
| kwa | 17072 | 19.5% |
| none | 2858 | 3.3% |
| mzee | 2699 | 3.1% |
| shuleni | 1659 | 1.9% |
| ya | 1196 | 1.4% |
| shule | 1095 | 1.3% |
| school | 880 | 1.0% |
| primary | 827 | 0.9% |
| zahanati | 778 | 0.9% |
| msingi | 693 | 0.8% |
| Other values (24966) | 57611 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 78885 | |
| i | 41759 | 8.0% |
| (space) | 39853 | 7.7% |
| n | 33594 | 6.5% |
| e | 32920 | 6.3% |
| w | 25319 | 4.9% |
| K | 25060 | 4.8% |
| o | 24321 | 4.7% |
| u | 19321 | 3.7% |
| M | 17624 | 3.4% |
| Other values (64) | 181893 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 394556 | |
| Uppercase Letter | 84048 | 16.1% |
| Space Separator | 39853 | 7.7% |
| Decimal Number | 1343 | 0.3% |
| Other Punctuation | 581 | 0.1% |
| Dash Punctuation | 87 | < 0.1% |
| Open Punctuation | 26 | < 0.1% |
| Close Punctuation | 26 | < 0.1% |
| Connector Punctuation | 16 | < 0.1% |
| Modifier Symbol | 13 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 78885 | |
| i | 41759 | |
| n | 33594 | 8.5% |
| e | 32920 | 8.3% |
| w | 25319 | 6.4% |
| o | 24321 | 6.2% |
| u | 19321 | 4.9% |
| l | 16705 | 4.2% |
| m | 14168 | 3.6% |
| h | 13757 | 3.5% |
| Other values (16) | 93807 |
Uppercase Letter
| Value | Count | Frequency (%) |
| K | 25060 | |
| M | 17624 | |
| S | 8558 | 10.2% |
| N | 3923 | 4.7% |
| A | 2789 | 3.3% |
| B | 2723 | 3.2% |
| C | 2256 | 2.7% |
| P | 2039 | 2.4% |
| L | 2014 | 2.4% |
| J | 1889 | 2.2% |
| Other values (16) | 15173 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 410 | |
| 2 | 357 | |
| 3 | 120 | 8.9% |
| 4 | 93 | 6.9% |
| 7 | 78 | 5.8% |
| 6 | 68 | 5.1% |
| 5 | 67 | 5.0% |
| 8 | 57 | 4.2% |
| 9 | 55 | 4.1% |
| 0 | 38 | 2.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 339 | |
| . | 134 | 23.1% |
| / | 106 | 18.2% |
| & | 2 | 0.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 20 | |
| [ | 6 | 23.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 20 | |
| ] | 6 | 23.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 39853 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 87 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 16 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 13 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 478604 | |
| Common | 41945 | 8.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 78885 | |
| i | 41759 | 8.7% |
| n | 33594 | 7.0% |
| e | 32920 | 6.9% |
| w | 25319 | 5.3% |
| K | 25060 | 5.2% |
| o | 24321 | 5.1% |
| u | 19321 | 4.0% |
| M | 17624 | 3.7% |
| l | 16705 | 3.5% |
| Other values (42) | 163096 |
Common
| Value | Count | Frequency (%) |
| (space) | 39853 | |
| 1 | 410 | 1.0% |
| 2 | 357 | 0.9% |
| ' | 339 | 0.8% |
| . | 134 | 0.3% |
| 3 | 120 | 0.3% |
| / | 106 | 0.3% |
| 4 | 93 | 0.2% |
| - | 87 | 0.2% |
| 7 | 78 | 0.2% |
| Other values (12) | 368 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 520549 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 78885 | |
| i | 41759 | 8.0% |
| (space) | 39853 | 7.7% |
| n | 33594 | 6.5% |
| e | 32920 | 6.3% |
| w | 25319 | 4.9% |
| K | 25060 | 4.8% |
| o | 24321 | 4.7% |
| u | 19321 | 3.7% |
| M | 17624 | 3.4% |
| Other values (64) | 181893 |
num_private
Real number (ℝ)
SKEWED  ZEROS 
| Distinct | 59 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5045664983 |
| Minimum | 0 |
| Maximum | 1776 |
| Zeros | 46903 |
| Zeros (%) | 98.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1776 |
| Range | 1776 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 13.25384979 |
|---|---|
| Coefficient of variation (CV) | 26.2677959 |
| Kurtosis | 10076.14263 |
| Mean | 0.5045664983 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 89.07841041 |
| Sum | 23977 |
| Variance | 175.6645344 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 46903 | 98.7% |
| 6 | 61 | 0.1% |
| 1 | 56 | 0.1% |
| 5 | 37 | 0.1% |
| 8 | 37 | 0.1% |
| 32 | 35 | 0.1% |
| 15 | 31 | 0.1% |
| 45 | 31 | 0.1% |
| 39 | 27 | 0.1% |
| 7 | 24 | 0.1% |
| Other values (49) | 278 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 46903 | 98.7% |
| 1 | 56 | 0.1% |
| 2 | 19 | < 0.1% |
| 3 | 22 | < 0.1% |
| 4 | 15 | < 0.1% |
| Value | Count | Frequency (%) |
| 1776 | 1 | |
| 1402 | 1 | |
| 755 | 1 | |
| 698 | 1 | |
| 672 | 1 |
basin
Text
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 11 |
| Mean length | 10.89829545 |
| Min length | 6 |
Characters and Unicode
| Total characters | 517887 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Internal |
|---|---|
| 2nd row | Internal |
| 3rd row | Lake Rukwa |
| 4th row | Rufiji |
| 5th row | Wami / Ruvu |
| Value | Count | Frequency (%) |
| lake | 19374 | |
| / | 8404 | |
| victoria | 8205 | |
| pangani | 7143 | 8.2% |
| rufiji | 6375 | 7.3% |
| internal | 6224 | 7.1% |
| tanganyika | 5169 | 5.9% |
| wami | 4804 | 5.5% |
| ruvu | 4804 | 5.5% |
| nyasa | 4014 | 4.6% |
| Other values (4) | 12786 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 85614 | |
| i | 46276 | 8.9% |
| n | 40672 | 7.9% |
| (space) | 39782 | 7.7% |
| e | 29198 | 5.6% |
| u | 28769 | 5.6% |
| k | 26529 | 5.1% |
| t | 21629 | 4.2% |
| L | 19374 | 3.7% |
| r | 18029 | 3.5% |
| Other values (22) | 162015 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 390803 | |
| Uppercase Letter | 78898 | 15.2% |
| Space Separator | 39782 | 7.7% |
| Other Punctuation | 8404 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 85614 | |
| i | 46276 | |
| n | 40672 | |
| e | 29198 | 7.5% |
| u | 28769 | 7.4% |
| k | 26529 | 6.8% |
| t | 21629 | 5.5% |
| r | 18029 | 4.6% |
| o | 15405 | 3.9% |
| g | 12312 | 3.2% |
| Other values (10) | 66370 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 19374 | |
| R | 16765 | |
| V | 8205 | |
| P | 7143 | 9.1% |
| I | 6224 | 7.9% |
| T | 5169 | 6.6% |
| W | 4804 | 6.1% |
| N | 4014 | 5.1% |
| S | 3600 | 4.6% |
| C | 3600 | 4.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 39782 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 8404 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 469701 | |
| Common | 48186 | 9.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 85614 | |
| i | 46276 | 9.9% |
| n | 40672 | 8.7% |
| e | 29198 | 6.2% |
| u | 28769 | 6.1% |
| k | 26529 | 5.6% |
| t | 21629 | 4.6% |
| L | 19374 | 4.1% |
| r | 18029 | 3.8% |
| R | 16765 | 3.6% |
| Other values (20) | 136846 |
Common
| Value | Count | Frequency (%) |
| (space) | 39782 | |
| / | 8404 | 17.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 517887 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 85614 | |
| i | 46276 | 8.9% |
| n | 40672 | 7.9% |
| (space) | 39782 | 7.7% |
| e | 29198 | 5.6% |
| u | 28769 | 5.6% |
| k | 26529 | 5.1% |
| t | 21629 | 4.2% |
| L | 19374 | 3.7% |
| r | 18029 | 3.5% |
| Other values (22) | 162015 |
subvillage
Text
| Distinct | 17232 |
|---|---|
| Distinct (%) | 36.5% |
| Missing | 296 |
| Missing (%) | 0.6% |
| Memory size | 371.4 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 26 |
| Mean length | 7.899690835 |
| Min length | 1 |
Characters and Unicode
| Total characters | 373055 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 10 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 9012 ? |
|---|---|
| Unique (%) | 19.1% |
Sample
| 1st row | Bashnet Kati |
|---|---|
| 2nd row | Lukali |
| 3rd row | Chawalikozi |
| 4th row | Shuleni |
| 5th row | Ngholong |
| Value | Count | Frequency (%) |
| a | 1917 | 3.4% |
| b | 1636 | 2.9% |
| kati | 1529 | 2.7% |
| wa | 488 | 0.9% |
| shuleni | 486 | 0.9% |
| majengo | 481 | 0.8% |
| madukani | 449 | 0.8% |
| mtaa | 420 | 0.7% |
| juu | 330 | 0.6% |
| mjini | 309 | 0.5% |
| Other values (15351) | 48626 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 57714 | |
| i | 36697 | 9.8% |
| n | 26878 | 7.2% |
| u | 21144 | 5.7% |
| e | 20480 | 5.5% |
| o | 18727 | 5.0% |
| M | 16364 | 4.4% |
| g | 15119 | 4.1% |
| l | 13022 | 3.5% |
| m | 12046 | 3.2% |
| Other values (63) | 134864 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 305049 | |
| Uppercase Letter | 57076 | 15.3% |
| Space Separator | 9448 | 2.5% |
| Other Punctuation | 952 | 0.3% |
| Decimal Number | 459 | 0.1% |
| Dash Punctuation | 31 | < 0.1% |
| Modifier Symbol | 30 | < 0.1% |
| Open Punctuation | 4 | < 0.1% |
| Close Punctuation | 4 | < 0.1% |
| Connector Punctuation | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 57714 | |
| i | 36697 | |
| n | 26878 | 8.8% |
| u | 21144 | 6.9% |
| e | 20480 | 6.7% |
| o | 18727 | 6.1% |
| g | 15119 | 5.0% |
| l | 13022 | 4.3% |
| m | 12046 | 3.9% |
| b | 9466 | 3.1% |
| Other values (16) | 73756 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 16364 | |
| K | 10019 | |
| N | 4830 | 8.5% |
| B | 4116 | 7.2% |
| I | 3564 | 6.2% |
| S | 3272 | 5.7% |
| A | 2466 | 4.3% |
| C | 2009 | 3.5% |
| L | 1969 | 3.4% |
| U | 1388 | 2.4% |
| Other values (15) | 7079 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 181 | |
| 2 | 56 | 12.2% |
| 4 | 41 | 8.9% |
| 3 | 37 | 8.1% |
| 9 | 27 | 5.9% |
| 6 | 26 | 5.7% |
| 5 | 26 | 5.7% |
| 8 | 24 | 5.2% |
| 0 | 22 | 4.8% |
| 7 | 19 | 4.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 826 | |
| / | 101 | 10.6% |
| . | 23 | 2.4% |
| # | 2 | 0.2% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3 | |
| [ | 1 | 25.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3 | |
| ] | 1 | 25.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 9448 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 31 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 30 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 362125 | |
| Common | 10930 | 2.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 57714 | |
| i | 36697 | 10.1% |
| n | 26878 | 7.4% |
| u | 21144 | 5.8% |
| e | 20480 | 5.7% |
| o | 18727 | 5.2% |
| M | 16364 | 4.5% |
| g | 15119 | 4.2% |
| l | 13022 | 3.6% |
| m | 12046 | 3.3% |
| Other values (41) | 123934 |
Common
| Value | Count | Frequency (%) |
| (space) | 9448 | |
| ' | 826 | 7.6% |
| 1 | 181 | 1.7% |
| / | 101 | 0.9% |
| 2 | 56 | 0.5% |
| 4 | 41 | 0.4% |
| 3 | 37 | 0.3% |
| - | 31 | 0.3% |
| ` | 30 | 0.3% |
| 9 | 27 | 0.2% |
| Other values (12) | 152 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 373055 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 57714 | |
| i | 36697 | 9.8% |
| n | 26878 | 7.2% |
| u | 21144 | 5.7% |
| e | 20480 | 5.5% |
| o | 18727 | 5.0% |
| M | 16364 | 4.4% |
| g | 15119 | 4.1% |
| l | 13022 | 3.5% |
| m | 12046 | 3.2% |
| Other values (63) | 134864 |
region
Text
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 13 |
|---|---|
| Median length | 11 |
| Mean length | 6.620896465 |
| Min length | 4 |
Characters and Unicode
| Total characters | 314625 |
|---|---|
| Distinct characters | 32 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Manyara |
|---|---|
| 2nd row | Dodoma |
| 3rd row | Mbeya |
| 4th row | Mbeya |
| 5th row | Morogoro |
| Value | Count | Frequency (%) |
| iringa | 4254 | 8.7% |
| shinyanga | 3977 | 8.1% |
| mbeya | 3659 | 7.5% |
| kilimanjaro | 3466 | 7.1% |
| morogoro | 3223 | 6.6% |
| arusha | 2692 | 5.5% |
| kagera | 2662 | 5.5% |
| mwanza | 2475 | 5.1% |
| kigoma | 2255 | 4.6% |
| pwani | 2115 | 4.3% |
| Other values (13) | 18050 | 37.0% |
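The counts in the word table above sum to 48,828, which equals the 47,520 rows plus the 1,308 spaces reported for this column, so the table appears to count lowercased whitespace-separated tokens rather than whole values. The sketch below illustrates that reading; the tokenisation rule is an assumption inferred from the numbers, and "Dar es Salaam" is a hypothetical multi-word value added only to show the effect.

```python
from collections import Counter

# First five sample rows from the report, plus one hypothetical
# multi-word value to show how a single row yields several tokens.
rows = ["Manyara", "Dodoma", "Mbeya", "Mbeya", "Morogoro", "Dar es Salaam"]

# Assumed tokenisation: lowercase, then split on whitespace.
words = Counter(word for row in rows for word in row.lower().split())

# Frequencies are taken over the number of tokens, not the number of rows.
total_words = sum(words.values())
for word, count in words.most_common(3):
    print(f"{word}: {count} ({count / total_words:.1%})")
```

Under this reading, a word's frequency can differ from the share of rows containing it whenever values contain spaces.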
Most occurring characters
| Value | Count | Frequency (%) |
| a | 66715 | 21.2% |
| n | 26496 | 8.4% |
| r | 25982 | 8.3% |
| i | 25361 | 8.1% |
| o | 23701 | 7.5% |
| g | 20087 | 6.4% |
| M | 13587 | 4.3% |
| m | 10235 | 3.3% |
| y | 8902 | 2.8% |
| K | 8383 | 2.7% |
| Other values (22) | 85176 | 27.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 265143 | 84.3% |
| Uppercase Letter | 48174 | 15.3% |
| Space Separator | 1308 | 0.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 66715 | 25.2% |
| n | 26496 | 10.0% |
| r | 25982 | 9.8% |
| i | 25361 | 9.6% |
| o | 23701 | 8.9% |
| g | 20087 | 7.6% |
| m | 10235 | 3.9% |
| y | 8902 | 3.4% |
| u | 8356 | 3.2% |
| w | 7418 | 2.8% |
| Other values (11) | 41890 | 15.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 13587 | 28.2% |
| K | 8383 | 17.4% |
| S | 6295 | 13.1% |
| I | 4254 | 8.8% |
| T | 3630 | 7.5% |
| R | 3559 | 7.4% |
| A | 2692 | 5.6% |
| D | 2409 | 5.0% |
| P | 2115 | 4.4% |
| L | 1250 | 2.6% |
Space Separator
| Value | Count | Frequency (%) |
|   | 1308 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 313317 | 99.6% |
| Common | 1308 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 66715 | 21.3% |
| n | 26496 | 8.5% |
| r | 25982 | 8.3% |
| i | 25361 | 8.1% |
| o | 23701 | 7.6% |
| g | 20087 | 6.4% |
| M | 13587 | 4.3% |
| m | 10235 | 3.3% |
| y | 8902 | 2.8% |
| K | 8383 | 2.7% |
| Other values (21) | 83868 | 26.8% |
Common
| Value | Count | Frequency (%) |
|   | 1308 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 314625 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 66715 | 21.2% |
| n | 26496 | 8.4% |
| r | 25982 | 8.3% |
| i | 25361 | 8.1% |
| o | 23701 | 7.5% |
| g | 20087 | 6.4% |
| M | 13587 | 4.3% |
| m | 10235 | 3.3% |
| y | 8902 | 2.8% |
| K | 8383 | 2.7% |
| Other values (22) | 85176 | 27.1% |
region_code
Real number (ℝ)
| Distinct | 27 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.32651515 |
| Minimum | 1 |
|---|---|
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 12 |
| Q3 | 17 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 17.61879844 |
|---|---|
| Coefficient of variation (CV) | 1.149563241 |
| Kurtosis | 10.20445601 |
| Mean | 15.32651515 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 3.162410202 |
| Sum | 728316 |
| Variance | 310.4220584 |
| Monotonicity | Not monotonic |
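The descriptive statistics above are related by simple identities, which gives a quick consistency check on the table. The sketch below recomputes the mean, variance and coefficient of variation from the reported sum, row count and standard deviation; the variable names are illustrative and the figures are copied from this report.

```python
# Figures copied from the region_code tables in this report.
n_rows = 47520
total = 728316             # reported "Sum"
std_dev = 17.61879844      # reported "Standard deviation"

# Mean is the sum divided by the number of observations.
mean = total / n_rows

# Variance is the square of the standard deviation.
variance = std_dev ** 2

# Coefficient of variation is the standard deviation relative to the mean.
cv = std_dev / mean

print(f"mean     = {mean:.8f}")
print(f"variance = {variance:.7f}")
print(f"CV       = {cv:.9f}")
```

A CV above 1 for what is really a categorical code is a hint that these moments say little about the column; the value counts are usually the more informative view.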
| Value | Count | Frequency (%) |
| 11 | 4259 | 9.0% |
| 17 | 4000 | 8.4% |
| 12 | 3659 | 7.7% |
| 3 | 3466 | 7.3% |
| 5 | 3249 | 6.8% |
| 18 | 2669 | 5.6% |
| 2 | 2430 | 5.1% |
| 19 | 2429 | 5.1% |
| 16 | 2255 | 4.7% |
| 10 | 2105 | 4.4% |
| Other values (17) | 16999 | 35.8% |
| Value | Count | Frequency (%) |
| 1 | 1755 | 3.7% |
| 2 | 2430 | 5.1% |
| 3 | 3466 | 7.3% |
| 4 | 2026 | 4.3% |
| 5 | 3249 | 6.8% |
| Value | Count | Frequency (%) |
| 99 | 343 | 0.7% |
| 90 | 722 | 1.5% |
| 80 | 1002 | 2.1% |
| 60 | 839 | 1.8% |
| 40 | 1 | < 0.1% |
district_code
Real number (ℝ)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.639309764 |
| Minimum | 0 |
|---|---|
| Maximum | 80 |
| Zeros | 19 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 30 |
| Maximum | 80 |
| Range | 80 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 9.661284976 |
|---|---|
| Coefficient of variation (CV) | 1.713203456 |
| Kurtosis | 16.05677349 |
| Mean | 5.639309764 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 3.948457815 |
| Sum | 267980 |
| Variance | 93.34042739 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 9755 | 20.5% |
| 2 | 8991 | 18.9% |
| 3 | 7999 | 16.8% |
| 4 | 7166 | 15.1% |
| 5 | 3467 | 7.3% |
| 6 | 3275 | 6.9% |
| 7 | 2663 | 5.6% |
| 8 | 818 | 1.7% |
| 30 | 795 | 1.7% |
| 33 | 687 | 1.4% |
| Other values (10) | 1904 | 4.0% |
| Value | Count | Frequency (%) |
| 0 | 19 | < 0.1% |
| 1 | 9755 | 20.5% |
| 2 | 8991 | 18.9% |
| 3 | 7999 | 16.8% |
| 4 | 7166 | 15.1% |
| Value | Count | Frequency (%) |
| 80 | 8 | < 0.1% |
| 67 | 3 | < 0.1% |
| 63 | 157 | 0.3% |
| 62 | 87 | 0.2% |
| 60 | 55 | 0.1% |
lga
Text
| Distinct | 125 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 14 |
| Mean length | 7.42081229 |
| Min length | 3 |
Characters and Unicode
| Total characters | 352637 |
|---|---|
| Distinct characters | 41 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Babati |
|---|---|
| 2nd row | Bahi |
| 3rd row | Mbozi |
| 4th row | Mbarali |
| 5th row | Kilosa |
| Value | Count | Frequency (%) |
| rural | 7642 | 13.5% |
| njombe | 2001 | 3.5% |
| urban | 1334 | 2.4% |
| moshi | 1070 | 1.9% |
| arusha | 1055 | 1.9% |
| bariadi | 948 | 1.7% |
| singida | 922 | 1.6% |
| kilosa | 876 | 1.6% |
| rungwe | 866 | 1.5% |
| mbozi | 844 | 1.5% |
| Other values (106) | 38938 | 68.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 55893 | 15.9% |
| o | 24121 | 6.8% |
| i | 23581 | 6.7% |
| u | 22727 | 6.4% |
| r | 21574 | 6.1% |
| e | 18148 | 5.1% |
| n | 18019 | 5.1% |
| l | 15384 | 4.4% |
| g | 14704 | 4.2% |
| M | 12826 | 3.6% |
| Other values (31) | 125660 | 35.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 287165 | 81.4% |
| Uppercase Letter | 56496 | 16.0% |
| Space Separator | 8976 | 2.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 55893 | 19.5% |
| o | 24121 | 8.4% |
| i | 23581 | 8.2% |
| u | 22727 | 7.9% |
| r | 21574 | 7.5% |
| e | 18148 | 6.3% |
| n | 18019 | 6.3% |
| l | 15384 | 5.4% |
| g | 14704 | 5.1% |
| m | 12504 | 4.4% |
| Other values (14) | 60510 | 21.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 12826 | 22.7% |
| R | 9745 | 17.2% |
| K | 9327 | 16.5% |
| S | 5007 | 8.9% |
| N | 4607 | 8.2% |
| B | 3864 | 6.8% |
| U | 2728 | 4.8% |
| I | 1978 | 3.5% |
| L | 1734 | 3.1% |
| T | 1096 | 1.9% |
| Other values (6) | 3584 | 6.3% |
Space Separator
| Value | Count | Frequency (%) |
|   | 8976 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 343661 | 97.5% |
| Common | 8976 | 2.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 55893 | 16.3% |
| o | 24121 | 7.0% |
| i | 23581 | 6.9% |
| u | 22727 | 6.6% |
| r | 21574 | 6.3% |
| e | 18148 | 5.3% |
| n | 18019 | 5.2% |
| l | 15384 | 4.5% |
| g | 14704 | 4.3% |
| M | 12826 | 3.7% |
| Other values (30) | 116684 | 34.0% |
Common
| Value | Count | Frequency (%) |
|   | 8976 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 352637 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 55893 | 15.9% |
| o | 24121 | 6.8% |
| i | 23581 | 6.7% |
| u | 22727 | 6.4% |
| r | 21574 | 6.1% |
| e | 18148 | 5.1% |
| n | 18019 | 5.1% |
| l | 15384 | 4.4% |
| g | 14704 | 4.2% |
| M | 12826 | 3.6% |
| Other values (31) | 125660 | 35.6% |
ward
Text
| Distinct | 2076 |
|---|---|
| Distinct (%) | 4.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 19 |
| Mean length | 7.49760101 |
| Min length | 3 |
Characters and Unicode
| Total characters | 356286 |
|---|---|
| Distinct characters | 54 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 44 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Bashinet |
|---|---|
| 2nd row | Lamaiti |
| 3rd row | Ndalambo |
| 4th row | Chimala |
| 5th row | Chakwale |
| Value | Count | Frequency (%) |
| mashariki | 464 | 0.9% |
| urban | 430 | 0.8% |
| siha | 346 | 0.7% |
| kusini | 305 | 0.6% |
| magharibi | 291 | 0.6% |
| igosi | 242 | 0.5% |
| masama | 241 | 0.5% |
| machame | 221 | 0.4% |
| kati | 218 | 0.4% |
| imalinyi | 203 | 0.4% |
| Other values (2089) | 48833 | 94.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 55539 | 15.6% |
| i | 32184 | 9.0% |
| n | 23548 | 6.6% |
| u | 21647 | 6.1% |
| o | 20887 | 5.9% |
| e | 18776 | 5.3% |
| g | 16925 | 4.8% |
| M | 15049 | 4.2% |
| m | 12995 | 3.6% |
| l | 12565 | 3.5% |
| Other values (44) | 126171 | 35.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 299424 | 84.0% |
| Uppercase Letter | 51590 | 14.5% |
| Space Separator | 4303 | 1.2% |
| Other Punctuation | 951 | 0.3% |
| Dash Punctuation | 18 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 55539 | 18.5% |
| i | 32184 | 10.7% |
| n | 23548 | 7.9% |
| u | 21647 | 7.2% |
| o | 20887 | 7.0% |
| e | 18776 | 6.3% |
| g | 16925 | 5.7% |
| m | 12995 | 4.3% |
| l | 12565 | 4.2% |
| r | 10479 | 3.5% |
| Other values (15) | 73879 | 24.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 15049 | 29.2% |
| K | 8960 | 17.4% |
| I | 4863 | 9.4% |
| N | 4716 | 9.1% |
| S | 2704 | 5.2% |
| L | 2559 | 5.0% |
| B | 2534 | 4.9% |
| U | 2320 | 4.5% |
| C | 1693 | 3.3% |
| R | 1341 | 2.6% |
| Other values (15) | 4851 | 9.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 830 | 87.3% |
| / | 121 | 12.7% |
Space Separator
| Value | Count | Frequency (%) |
|   | 4303 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 18 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 351014 | 98.5% |
| Common | 5272 | 1.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 55539 | 15.8% |
| i | 32184 | 9.2% |
| n | 23548 | 6.7% |
| u | 21647 | 6.2% |
| o | 20887 | 6.0% |
| e | 18776 | 5.3% |
| g | 16925 | 4.8% |
| M | 15049 | 4.3% |
| m | 12995 | 3.7% |
| l | 12565 | 3.6% |
| Other values (40) | 120899 | 34.4% |
Common
| Value | Count | Frequency (%) |
|   | 4303 | 81.6% |
| ' | 830 | 15.7% |
| / | 121 | 2.3% |
| - | 18 | 0.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 356286 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 55539 | 15.6% |
| i | 32184 | 9.0% |
| n | 23548 | 6.6% |
| u | 21647 | 6.1% |
| o | 20887 | 5.9% |
| e | 18776 | 5.3% |
| g | 16925 | 4.8% |
| M | 15049 | 4.2% |
| m | 12995 | 3.6% |
| l | 12565 | 3.5% |
| Other values (44) | 126171 | 35.4% |
population
Real number (ℝ)
ZEROS 
| Distinct | 971 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 179.5282828 |
| Minimum | 0 |
|---|---|
| Maximum | 30500 |
| Zeros | 17048 |
| Zeros (%) | 35.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 25 |
| Q3 | 213 |
| 95-th percentile | 678 |
| Maximum | 30500 |
| Range | 30500 |
| Interquartile range (IQR) | 213 |
Descriptive statistics
| Standard deviation | 472.7729975 |
|---|---|
| Coefficient of variation (CV) | 2.633417922 |
| Kurtosis | 468.8981322 |
| Mean | 179.5282828 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 13.53762994 |
| Sum | 8531184 |
| Variance | 223514.3071 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 17048 | 35.9% |
| 1 | 5655 | 11.9% |
| 200 | 1553 | 3.3% |
| 150 | 1512 | 3.2% |
| 250 | 1364 | 2.9% |
| 300 | 1173 | 2.5% |
| 100 | 940 | 2.0% |
| 50 | 924 | 1.9% |
| 500 | 825 | 1.7% |
| 350 | 778 | 1.6% |
| Other values (961) | 15748 | 33.1% |
| Value | Count | Frequency (%) |
| 0 | 17048 | 35.9% |
| 1 | 5655 | 11.9% |
| 2 | 3 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 12 | < 0.1% |
| Value | Count | Frequency (%) |
| 30500 | 1 | < 0.1% |
| 15300 | 1 | < 0.1% |
| 11463 | 1 | < 0.1% |
| 10000 | 1 | < 0.1% |
| 9865 | 1 | < 0.1% |
public_meeting
Boolean
IMBALANCE  MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2689 |
| Missing (%) | 5.7% |
| Memory size | 371.4 KiB |
| Value | Count | Frequency (%) |
| True | 40743 | 85.7% |
| False | 4088 | 8.6% |
| (Missing) | 2689 | 5.7% |
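The IMBALANCE flag on this column corresponds to the 56.0% quoted in the report's alert list. That figure is consistent with one minus the normalised Shannon entropy of the non-missing class counts. The sketch below works through that calculation; the formula is an assumption inferred from the numbers, not something the report states.

```python
import math

# Counts copied from the public_meeting table above (missing rows excluded).
counts = {"True": 40743, "False": 4088}

total = sum(counts.values())
probs = [c / total for c in counts.values()]

# Shannon entropy in bits, normalised by the maximum entropy log2(k)
# for k classes, so that a perfectly balanced column gives 1.0.
entropy = -sum(p * math.log2(p) for p in probs)
max_entropy = math.log2(len(counts))

# Assumed definition: imbalance = 1 - normalised entropy.
imbalance = 1 - entropy / max_entropy

print(f"imbalance score: {imbalance:.1%}")
```

Under this definition a score of 0 means perfectly balanced classes and a score near 1 means almost every row falls in one class.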
recorded_by
Text
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 1092960 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | GeoData Consultants Ltd |
|---|---|
| 2nd row | GeoData Consultants Ltd |
| 3rd row | GeoData Consultants Ltd |
| 4th row | GeoData Consultants Ltd |
| 5th row | GeoData Consultants Ltd |
| Value | Count | Frequency (%) |
| geodata | 47520 | 33.3% |
| consultants | 47520 | 33.3% |
| ltd | 47520 | 33.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 190080 | 17.4% |
| a | 142560 | 13.0% |
| o | 95040 | 8.7% |
|   | 95040 | 8.7% |
| n | 95040 | 8.7% |
| s | 95040 | 8.7% |
| G | 47520 | 4.3% |
| e | 47520 | 4.3% |
| D | 47520 | 4.3% |
| C | 47520 | 4.3% |
| Other values (4) | 190080 | 17.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 807840 | 73.9% |
| Uppercase Letter | 190080 | 17.4% |
| Space Separator | 95040 | 8.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 190080 | 23.5% |
| a | 142560 | 17.6% |
| o | 95040 | 11.8% |
| n | 95040 | 11.8% |
| s | 95040 | 11.8% |
| e | 47520 | 5.9% |
| u | 47520 | 5.9% |
| l | 47520 | 5.9% |
| d | 47520 | 5.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 47520 | 25.0% |
| D | 47520 | 25.0% |
| C | 47520 | 25.0% |
| L | 47520 | 25.0% |
Space Separator
| Value | Count | Frequency (%) |
|   | 95040 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 997920 | 91.3% |
| Common | 95040 | 8.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 190080 | 19.0% |
| a | 142560 | 14.3% |
| o | 95040 | 9.5% |
| n | 95040 | 9.5% |
| s | 95040 | 9.5% |
| G | 47520 | 4.8% |
| e | 47520 | 4.8% |
| D | 47520 | 4.8% |
| C | 47520 | 4.8% |
| u | 47520 | 4.8% |
| Other values (3) | 142560 | 14.3% |
Common
| Value | Count | Frequency (%) |
|   | 95040 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1092960 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 190080 | 17.4% |
| a | 142560 | 13.0% |
| o | 95040 | 8.7% |
|   | 95040 | 8.7% |
| n | 95040 | 8.7% |
| s | 95040 | 8.7% |
| G | 47520 | 4.3% |
| e | 47520 | 4.3% |
| D | 47520 | 4.3% |
| C | 47520 | 4.3% |
| Other values (4) | 190080 | 17.4% |
scheme_management
Text
MISSING 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 3103 |
| Missing (%) | 6.5% |
| Memory size | 371.4 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.642073981 |
| Min length | 3 |
Characters and Unicode
| Total characters | 206187 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Water Board |
|---|---|
| 2nd row | VWC |
| 3rd row | VWC |
| 4th row | VWC |
| 5th row | VWC |
| Value | Count | Frequency (%) |
| vwc | 29462 | 59.0% |
| water | 4697 | 9.4% |
| wug | 4161 | 8.3% |
| authority | 2522 | 5.0% |
| wua | 2312 | 4.6% |
| board | 2175 | 4.4% |
| parastatal | 1346 | 2.7% |
| private | 862 | 1.7% |
| operator | 862 | 1.7% |
| company | 820 | 1.6% |
| Other values (3) | 757 | 1.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| W | 40707 | 19.7% |
| C | 30357 | 14.7% |
| V | 29462 | 14.3% |
| a | 17322 | 8.4% |
| t | 14839 | 7.2% |
| r | 14008 | 6.8% |
| o | 7241 | 3.5% |
| e | 7047 | 3.4% |
| U | 6473 | 3.1% |
|   | 5559 | 2.7% |
| Other values (18) | 33172 | 16.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 118612 | 57.5% |
| Lowercase Letter | 82016 | 39.8% |
| Space Separator | 5559 | 2.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 17322 | 21.1% |
| t | 14839 | 18.1% |
| r | 14008 | 17.1% |
| o | 7241 | 8.8% |
| e | 7047 | 8.6% |
| i | 3384 | 4.1% |
| y | 3342 | 4.1% |
| h | 3148 | 3.8% |
| u | 2578 | 3.1% |
| d | 2175 | 2.7% |
| Other values (6) | 6932 | 8.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 40707 | 34.3% |
| C | 30357 | 25.6% |
| V | 29462 | 24.8% |
| U | 6473 | 5.5% |
| G | 4161 | 3.5% |
| A | 2312 | 1.9% |
| P | 2208 | 1.9% |
| B | 2175 | 1.8% |
| O | 626 | 0.5% |
| S | 75 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
|   | 5559 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 200628 | 97.3% |
| Common | 5559 | 2.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| W | 40707 | 20.3% |
| C | 30357 | 15.1% |
| V | 29462 | 14.7% |
| a | 17322 | 8.6% |
| t | 14839 | 7.4% |
| r | 14008 | 7.0% |
| o | 7241 | 3.6% |
| e | 7047 | 3.5% |
| U | 6473 | 3.2% |
| G | 4161 | 2.1% |
| Other values (17) | 29011 | 14.5% |
Common
| Value | Count | Frequency (%) |
|   | 5559 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 206187 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| W | 40707 | 19.7% |
| C | 30357 | 14.7% |
| V | 29462 | 14.3% |
| a | 17322 | 8.4% |
| t | 14839 | 7.2% |
| r | 14008 | 6.8% |
| o | 7241 | 3.5% |
| e | 7047 | 3.4% |
| U | 6473 | 3.1% |
|   | 5559 | 2.7% |
| Other values (18) | 33172 | 16.1% |
scheme_name
Text
MISSING 
| Distinct | 2540 |
|---|---|
| Distinct (%) | 10.4% |
| Missing | 23036 |
| Missing (%) | 48.5% |
| Memory size | 371.4 KiB |
Length
| Max length | 46 |
|---|---|
| Median length | 37 |
| Mean length | 14.50743343 |
| Min length | 1 |
Characters and Unicode
| Total characters | 355200 |
|---|---|
| Distinct characters | 67 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 678 |
|---|---|
| Unique (%) | 2.8% |
Sample
| 1st row | Olikimo water project |
|---|---|
| 2nd row | S |
| 3rd row | Fufu |
| 4th row | Malemeu gravity water supply |
| 5th row | M |
| Value | Count | Frequency (%) |
| water | 7814 | 13.7% |
| supply | 5394 | 9.4% |
| scheme | 2019 | 3.5% |
| wa | 1712 | 3.0% |
| gravity | 1521 | 2.7% |
| maji | 1076 | 1.9% |
| pipe | 1070 | 1.9% |
| mradi | 875 | 1.5% |
| line | 815 | 1.4% |
| supplied | 681 | 1.2% |
| Other values (2391) | 34137 | 59.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 38748 | 10.9% |
|   | 33010 | 9.3% |
| e | 27675 | 7.8% |
| i | 21085 | 5.9% |
| p | 17915 | 5.0% |
| r | 17443 | 4.9% |
| t | 15351 | 4.3% |
| u | 14711 | 4.1% |
| l | 13851 | 3.9% |
| n | 13658 | 3.8% |
| Other values (57) | 141753 | 39.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 280627 | 79.0% |
| Uppercase Letter | 39741 | 11.2% |
| Space Separator | 33010 | 9.3% |
| Other Punctuation | 1040 | 0.3% |
| Dash Punctuation | 430 | 0.1% |
| Open Punctuation | 157 | < 0.1% |
| Decimal Number | 118 | < 0.1% |
| Modifier Symbol | 52 | < 0.1% |
| Close Punctuation | 25 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 38748 | 13.8% |
| e | 27675 | 9.9% |
| i | 21085 | 7.5% |
| p | 17915 | 6.4% |
| r | 17443 | 6.2% |
| t | 15351 | 5.5% |
| u | 14711 | 5.2% |
| l | 13851 | 4.9% |
| n | 13658 | 4.9% |
| o | 13493 | 4.8% |
| Other values (16) | 86697 | 30.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 7422 | 18.7% |
| K | 4500 | 11.3% |
| N | 3042 | 7.7% |
| S | 3041 | 7.7% |
| A | 2246 | 5.7% |
| I | 2134 | 5.4% |
| W | 2067 | 5.2% |
| B | 1928 | 4.9% |
| L | 1669 | 4.2% |
| U | 1429 | 3.6% |
| Other values (15) | 10263 | 25.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 52 | 44.1% |
| 3 | 41 | 34.7% |
| 1 | 6 | 5.1% |
| 4 | 6 | 5.1% |
| 7 | 5 | 4.2% |
| 5 | 4 | 3.4% |
| 0 | 2 | 1.7% |
| 6 | 2 | 1.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 743 | 71.4% |
| / | 290 | 27.9% |
| & | 7 | 0.7% |
Space Separator
| Value | Count | Frequency (%) |
|   | 33010 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 430 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 157 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 52 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 25 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 320368 | 90.2% |
| Common | 34832 | 9.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 38748 | 12.1% |
| e | 27675 | 8.6% |
| i | 21085 | 6.6% |
| p | 17915 | 5.6% |
| r | 17443 | 5.4% |
| t | 15351 | 4.8% |
| u | 14711 | 4.6% |
| l | 13851 | 4.3% |
| n | 13658 | 4.3% |
| o | 13493 | 4.2% |
| Other values (41) | 126438 | 39.5% |
Common
| Value | Count | Frequency (%) |
|   | 33010 | 94.8% |
| ' | 743 | 2.1% |
| - | 430 | 1.2% |
| / | 290 | 0.8% |
| ( | 157 | 0.5% |
| ` | 52 | 0.1% |
| 2 | 52 | 0.1% |
| 3 | 41 | 0.1% |
| ) | 25 | 0.1% |
| & | 7 | < 0.1% |
| Other values (6) | 25 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 355200 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 38748 | 10.9% |
|   | 33010 | 9.3% |
| e | 27675 | 7.8% |
| i | 21085 | 5.9% |
| p | 17915 | 5.0% |
| r | 17443 | 4.9% |
| t | 15351 | 4.3% |
| u | 14711 | 4.1% |
| l | 13851 | 3.9% |
| n | 13658 | 3.8% |
| Other values (57) | 141753 | 39.9% |
permit
Boolean
MISSING 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2439 |
| Missing (%) | 5.1% |
| Memory size | 371.4 KiB |
| Value | Count | Frequency (%) |
| True | 31028 | 65.3% |
| False | 14053 | 29.6% |
| (Missing) | 2439 | 5.1% |
construction_year
Real number (ℝ)
ZEROS 
| Distinct | 55 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1303.353199 |
| Minimum | 0 |
|---|---|
| Maximum | 2013 |
| Zeros | 16503 |
| Zeros (%) | 34.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 371.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1986 |
| Q3 | 2004 |
| 95-th percentile | 2010 |
| Maximum | 2013 |
| Range | 2013 |
| Interquartile range (IQR) | 2004 |
Descriptive statistics
| Standard deviation | 950.763878 |
|---|---|
| Coefficient of variation (CV) | 0.7294752328 |
| Kurtosis | -1.588462948 |
| Mean | 1303.353199 |
| Median Absolute Deviation (MAD) | 22 |
| Skewness | -0.6411804174 |
| Sum | 61935344 |
| Variance | 903951.9517 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 16503 | 34.7% |
| 2010 | 2133 | 4.5% |
| 2008 | 2124 | 4.5% |
| 2009 | 2027 | 4.3% |
| 2000 | 1682 | 3.5% |
| 2007 | 1275 | 2.7% |
| 2006 | 1174 | 2.5% |
| 2003 | 1035 | 2.2% |
| 2011 | 1003 | 2.1% |
| 2012 | 883 | 1.9% |
| Other values (45) | 17681 | 37.2% |
| Value | Count | Frequency (%) |
| 0 | 16503 | 34.7% |
| 1960 | 87 | 0.2% |
| 1961 | 16 | < 0.1% |
| 1962 | 27 | 0.1% |
| 1963 | 76 | 0.2% |
| Value | Count | Frequency (%) |
| 2013 | 134 | 0.3% |
| 2012 | 883 | 1.9% |
| 2011 | 1003 | 2.1% |
| 2010 | 2133 | 4.5% |
| 2009 | 2027 | 4.3% |
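The mean of 1303 and the IQR of 2004 above are driven by the 16,503 zero entries, since the smallest non-zero year in the table is 1960. The sketch below assumes that a construction year of 0 stands for "unknown" (the report does not say so) and recomputes the mean over known years from the reported sum and counts.

```python
# Figures copied from the construction_year summary above.
n_rows = 47520
n_zeros = 16503
total = 61935344  # reported "Sum"

# Assumption: a construction year of 0 means "unknown", not year zero.
n_known = n_rows - n_zeros

# Zeros contribute nothing to the sum, so only the denominator changes.
mean_known = total / n_known

print(f"known years: {n_known}")
print(f"mean construction year (known only): {mean_known:.1f}")
```

If that assumption holds, converting the zeros to missing values before profiling would make the quantile and dispersion statistics for this column meaningful.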
extraction_type
Text
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 25 |
|---|---|
| Median length | 17 |
| Mean length | 7.729356061 |
| Min length | 3 |
Characters and Unicode
| Total characters | 367299 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | india mark ii |
| 3rd row | other |
| 4th row | gravity |
| 5th row | other |
| Value | Count | Frequency (%) |
| gravity | 21340 | 37.9% |
| nira/tanira | 6566 | 11.7% |
| other | 5776 | 10.3% |
| submersible | 3851 | 6.8% |
| swn | 3143 | 5.6% |
| 80 | 2965 | 5.3% |
| mono | 2284 | 4.1% |
| india | 1991 | 3.5% |
| mark | 1991 | 3.5% |
| ii | 1920 | 3.4% |
| Other values (13) | 4516 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 48049 | 13.1% |
| r | 47875 | 13.0% |
| a | 46574 | 12.7% |
| t | 33682 | 9.2% |
| v | 22749 | 6.2% |
| y | 21412 | 5.8% |
| g | 21342 | 5.8% |
| n | 20638 | 5.6% |
| e | 15337 | 4.2% |
| s | 11958 | 3.3% |
| Other values (19) | 77683 | 21.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 344996 | 93.9% |
| Space Separator | 8823 | 2.4% |
| Other Punctuation | 6568 | 1.8% |
| Decimal Number | 6286 | 1.7% |
| Dash Punctuation | 626 | 0.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 48049 | 13.9% |
| r | 47875 | 13.9% |
| a | 46574 | 13.5% |
| t | 33682 | 9.8% |
| v | 22749 | 6.6% |
| y | 21412 | 6.2% |
| g | 21342 | 6.2% |
| n | 20638 | 6.0% |
| e | 15337 | 4.4% |
| s | 11958 | 3.5% |
| Other values (13) | 55380 | 16.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 3143 | 50.0% |
| 0 | 2965 | 47.2% |
| 1 | 178 | 2.8% |
Space Separator
| Value | Count | Frequency (%) |
|   | 8823 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 6568 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 626 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 344996 | 93.9% |
| Common | 22303 | 6.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 48049 | 13.9% |
| r | 47875 | 13.9% |
| a | 46574 | 13.5% |
| t | 33682 | 9.8% |
| v | 22749 | 6.6% |
| y | 21412 | 6.2% |
| g | 21342 | 6.2% |
| n | 20638 | 6.0% |
| e | 15337 | 4.4% |
| s | 11958 | 3.5% |
| Other values (13) | 55380 | 16.1% |
Common
| Value | Count | Frequency (%) |
|   | 8823 | 39.6% |
| / | 6568 | 29.4% |
| 8 | 3143 | 14.1% |
| 0 | 2965 | 13.3% |
| - | 626 | 2.8% |
| 1 | 178 | 0.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 367299 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 48049 | 13.1% |
| r | 47875 | 13.0% |
| a | 46574 | 12.7% |
| t | 33682 | 9.2% |
| v | 22749 | 6.2% |
| y | 21412 | 5.8% |
| g | 21342 | 5.8% |
| n | 20638 | 5.6% |
| e | 15337 | 4.2% |
| s | 11958 | 3.3% |
| Other values (19) | 77683 | 21.1% |
extraction_type_group
Text
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 14 |
| Mean length | 7.884617003 |
| Min length | 4 |
Characters and Unicode
| Total characters | 374677 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | india mark ii |
| 3rd row | other |
| 4th row | gravity |
| 5th row | other |
| Value | Count | Frequency (%) |
| gravity | 21340 | 38.6% |
| nira/tanira | 6566 | 11.9% |
| other | 5543 | 10.0% |
| submersible | 4962 | 9.0% |
| swn | 2965 | 5.4% |
| 80 | 2965 | 5.4% |
| mono | 2284 | 4.1% |
| mark | 1991 | 3.6% |
| india | 1991 | 3.6% |
| ii | 1920 | 3.5% |
| Other values (7) | 2709 | 4.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 48962 | 13.1% |
| r | 48939 | 13.1% |
| a | 46720 | 12.5% |
| t | 33551 | 9.0% |
| v | 22749 | 6.1% |
| g | 21340 | 5.7% |
| y | 21340 | 5.7% |
| n | 20747 | 5.5% |
| e | 17420 | 4.6% |
| s | 12889 | 3.4% |
| Other values (16) | 80020 | 21.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 354381 | 94.6% |
| Space Separator | 7716 | 2.1% |
| Other Punctuation | 6566 | 1.8% |
| Decimal Number | 5930 | 1.6% |
| Dash Punctuation | 84 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 48962 | 13.8% |
| r | 48939 | 13.8% |
| a | 46720 | 13.2% |
| t | 33551 | 9.5% |
| v | 22749 | 6.4% |
| g | 21340 | 6.0% |
| y | 21340 | 6.0% |
| n | 20747 | 5.9% |
| e | 17420 | 4.9% |
| s | 12889 | 3.6% |
| Other values (11) | 59724 | 16.9% |
Decimal Number
| Value | Count | Frequency (%) |
| 8 | 2965 | 50.0% |
| 0 | 2965 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
|   | 7716 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 6566 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 84 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 354381 | 94.6% |
| Common | 20296 | 5.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 48962 | 13.8% |
| r | 48939 | 13.8% |
| a | 46720 | 13.2% |
| t | 33551 | 9.5% |
| v | 22749 | 6.4% |
| g | 21340 | 6.0% |
| y | 21340 | 6.0% |
| n | 20747 | 5.9% |
| e | 17420 | 4.9% |
| s | 12889 | 3.6% |
| Other values (11) | 59724 | 16.9% |
Common
| Value | Count | Frequency (%) |
|   | 7716 | 38.0% |
| / | 6566 | 32.4% |
| 8 | 2965 | 14.6% |
| 0 | 2965 | 14.6% |
| - | 84 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 374677 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 48962 | 13.1% |
| r | 48939 | 13.1% |
| a | 46720 | 12.5% |
| t | 33551 | 9.0% |
| v | 22749 | 6.1% |
| g | 21340 | 5.7% |
| y | 21340 | 5.7% |
| n | 20747 | 5.5% |
| e | 17420 | 4.6% |
| s | 12889 | 3.4% |
| Other values (16) | 80020 | 21.4% |
extraction_type_class
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 11 |
| Mean length | 7.604250842 |
| Min length | 5 |
Characters and Unicode
| Total characters | 361354 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | gravity |
|---|---|
| 2nd row | handpump |
| 3rd row | other |
| 4th row | gravity |
| 5th row | other |
| Value | Count | Frequency (%) |
| gravity | 21340 | 44.6% |
| handpump | 13222 | 27.6% |
| other | 5150 | 10.8% |
| submersible | 4962 | 10.4% |
| motorpump | 2386 | 5.0% |
| rope | 376 | 0.8% |
| pump | 376 | 0.8% |
| wind-powered | 84 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 34562 | 9.6% |
| r | 34298 | 9.5% |
| p | 32428 | 9.0% |
| t | 28876 | 8.0% |
| i | 26386 | 7.3% |
| m | 23332 | 6.5% |
| g | 21340 | 5.9% |
| y | 21340 | 5.9% |
| v | 21340 | 5.9% |
| u | 20946 | 5.8% |
| Other values (11) | 96506 | 26.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 360894 | 99.9% |
| Space Separator | 376 | 0.1% |
| Dash Punctuation | 84 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 34562 | 9.6% |
| r | 34298 | 9.5% |
| p | 32428 | 9.0% |
| t | 28876 | 8.0% |
| i | 26386 | 7.3% |
| m | 23332 | 6.5% |
| g | 21340 | 5.9% |
| y | 21340 | 5.9% |
| v | 21340 | 5.9% |
| u | 20946 | 5.8% |
| Other values (9) | 96046 | 26.6% |
Space Separator
| Value | Count | Frequency (%) |
|   | 376 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 84 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 360894 | 99.9% |
| Common | 460 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 34562 | 9.6% |
| r | 34298 | 9.5% |
| p | 32428 | 9.0% |
| t | 28876 | 8.0% |
| i | 26386 | 7.3% |
| m | 23332 | 6.5% |
| g | 21340 | 5.9% |
| y | 21340 | 5.9% |
| v | 21340 | 5.9% |
| u | 20946 | 5.8% |
| Other values (9) | 96046 | 26.6% |
Common
| Value | Count | Frequency (%) |
|   | 376 | 81.7% |
| - | 84 | 18.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 361354 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 34562 | 9.6% |
| r | 34298 | 9.5% |
| p | 32428 | 9.0% |
| t | 28876 | 8.0% |
| i | 26386 | 7.3% |
| m | 23332 | 6.5% |
| g | 21340 | 5.9% |
| y | 21340 | 5.9% |
| v | 21340 | 5.9% |
| u | 20946 | 5.8% |
| Other values (11) | 96506 | 26.7% |
management
Text
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 3 |
| Mean length | 4.341582492 |
| Min length | 3 |
Characters and Unicode
| Total characters | 206312 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | water board |
|---|---|
| 2nd row | vwc |
| 3rd row | vwc |
| 4th row | vwc |
| 5th row | vwc |
| Value | Count | Frequency (%) |
| vwc | 32455 | |
| wug | 5204 | 10.0% |
| water | 3042 | 5.8% |
| board | 2326 | 4.4% |
| wua | 2033 | 3.9% |
| private | 1566 | 3.0% |
| operator | 1566 | 3.0% |
| parastatal | 1413 | 2.7% |
| other | 764 | 1.5% |
| authority | 716 | 1.4% |
| Other values (5) | 1205 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| w | 43190 | |
| v | 34021 | |
| c | 33060 | |
| a | 17425 | |
| r | 13022 | 6.3% |
| t | 11322 | 5.5% |
| u | 8472 | 4.1% |
| o | 8080 | 3.9% |
| e | 6938 | 3.4% |
| g | 5204 | 2.5% |
| Other values (13) | 25578 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 201461 | |
| Space Separator | 4770 | 2.3% |
| Dash Punctuation | 81 | < 0.1% |
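The category labels in this table match Unicode general categories (`Ll`, `Zs`, `Pd`). The following is a minimal sketch of how such per-category counts could be reproduced with Python's standard `unicodedata` module. The `values` list is a small hypothetical sample, not the real column, and `category_counts` and `LABELS` are illustrative names rather than part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Hypothetical sample values; the real column holds 47,520 rows.
values = ["water board", "vwc", "vwc", "private operator", "other - school"]

# Map two-letter Unicode general category codes to report-style labels.
LABELS = {
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}

def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for s in strings:
        for ch in s:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in LABELS.
            counts[LABELS.get(code, code)] += 1
    return counts

counts = category_counts(values)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label}: {n} ({100 * n / total:.1f}%)")
```

Applied to the full column, the same loop should yield the three categories listed above, assuming the profiler classifies characters by general category in the same way.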
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| w | 43190 | |
| v | 34021 | |
| c | 33060 | |
| a | 17425 | |
| r | 13022 | 6.5% |
| t | 11322 | 5.6% |
| u | 8472 | 4.2% |
| o | 8080 | 4.0% |
| e | 6938 | 3.4% |
| g | 5204 | 2.6% |
| Other values (11) | 20727 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4770 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 81 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 201461 | |
| Common | 4851 | 2.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| w | 43190 | |
| v | 34021 | |
| c | 33060 | |
| a | 17425 | |
| r | 13022 | 6.5% |
| t | 11322 | 5.6% |
| u | 8472 | 4.2% |
| o | 8080 | 4.0% |
| e | 6938 | 3.4% |
| g | 5204 | 2.6% |
| Other values (11) | 20727 |
Common
| Value | Count | Frequency (%) |
| (space) | 4770 | 98.3% |
| - | 81 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 206312 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| w | 43190 | |
| v | 34021 | |
| c | 33060 | |
| a | 17425 | |
| r | 13022 | 6.3% |
| t | 11322 | 5.5% |
| u | 8472 | 4.1% |
| o | 8080 | 3.9% |
| e | 6938 | 3.4% |
| g | 5204 | 2.5% |
| Other values (13) | 25578 |
management_group
Text
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 9.890824916 |
| Min length | 5 |
Characters and Unicode
| Total characters | 470012 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | user-group |
|---|---|
| 2nd row | user-group |
| 3rd row | user-group |
| 4th row | user-group |
| 5th row | user-group |
| Value | Count | Frequency (%) |
| user-group | 42018 | |
| commercial | 2869 | 6.0% |
| parastatal | 1413 | 3.0% |
| other | 764 | 1.6% |
| unknown | 456 | 1.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 89082 | |
| u | 84492 | |
| o | 46107 | |
| e | 45651 | |
| s | 43431 | |
| p | 43431 | |
| - | 42018 | |
| g | 42018 | |
| a | 8521 | 1.8% |
| m | 5738 | 1.2% |
| Other values (8) | 19523 | 4.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 427994 | |
| Dash Punctuation | 42018 | 8.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 89082 | |
| u | 84492 | |
| o | 46107 | |
| e | 45651 | |
| s | 43431 | |
| p | 43431 | |
| g | 42018 | |
| a | 8521 | 2.0% |
| m | 5738 | 1.3% |
| c | 5738 | 1.3% |
| Other values (7) | 13785 | 3.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 42018 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 427994 | |
| Common | 42018 | 8.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 89082 | |
| u | 84492 | |
| o | 46107 | |
| e | 45651 | |
| s | 43431 | |
| p | 43431 | |
| g | 42018 | |
| a | 8521 | 2.0% |
| m | 5738 | 1.3% |
| c | 5738 | 1.3% |
| Other values (7) | 13785 | 3.2% |
Common
| Value | Count | Frequency (%) |
| - | 42018 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 470012 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 89082 | |
| u | 84492 | |
| o | 46107 | |
| e | 45651 | |
| s | 43431 | |
| p | 43431 | |
| - | 42018 | |
| g | 42018 | |
| a | 8521 | 1.8% |
| m | 5738 | 1.2% |
| Other values (8) | 19523 | 4.2% |
payment
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 21 |
|---|---|
| Median length | 14 |
| Mean length | 10.66984428 |
| Min length | 5 |
Characters and Unicode
| Total characters | 507031 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | pay per bucket |
|---|---|
| 2nd row | never pay |
| 3rd row | never pay |
| 4th row | pay monthly |
| 5th row | pay when scheme fails |
| Value | Count | Frequency (%) |
| pay | 40155 | |
| never | 20318 | |
| per | 7223 | 7.1% |
| bucket | 7223 | 7.1% |
| monthly | 6574 | 6.5% |
| unknown | 6521 | 6.4% |
| when | 3154 | 3.1% |
| scheme | 3154 | 3.1% |
| fails | 3154 | 3.1% |
| annually | 2886 | 2.9% |
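The counts in this table appear to be per whitespace-separated word rather than per cell value, which is why `pay` (40155) exceeds any single category. A minimal Python sketch of that tokenization follows, assuming plain whitespace splitting (the profiler's actual tokenizer may differ). `word_counts` is a hypothetical helper, and the input is the five sample rows shown above.

```python
from collections import Counter

# The five sample rows listed for this variable.
sample = [
    "pay per bucket",
    "never pay",
    "never pay",
    "pay monthly",
    "pay when scheme fails",
]

def word_counts(values):
    """Count whitespace-separated words across all cell values."""
    counts = Counter()
    for value in values:
        counts.update(value.split())
    return counts

counts = word_counts(sample)
total_words = sum(counts.values())
for word, n in counts.most_common():
    print(f"{word}: {n} ({100 * n / total_words:.1f}%)")
```

Because multi-word values contribute one count per word, the word total is larger than the number of rows, and the percentages are relative to the word total.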
Most occurring characters
| Value | Count | Frequency (%) |
| e | 65388 | |
| n | 55381 | |
| (space) | 53686 | 10.6% |
| y | 49615 | |
| a | 49081 | |
| p | 47378 | |
| r | 28385 | 5.6% |
| v | 20318 | 4.0% |
| u | 16630 | 3.3% |
| l | 15500 | 3.1% |
| Other values (11) | 105669 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 453345 | |
| Space Separator | 53686 | 10.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 65388 | |
| n | 55381 | |
| y | 49615 | |
| a | 49081 | |
| p | 47378 | |
| r | 28385 | 6.3% |
| v | 20318 | 4.5% |
| u | 16630 | 3.7% |
| l | 15500 | 3.4% |
| t | 14641 | 3.2% |
| Other values (10) | 91028 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 53686 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 453345 | |
| Common | 53686 | 10.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 65388 | |
| n | 55381 | |
| y | 49615 | |
| a | 49081 | |
| p | 47378 | |
| r | 28385 | 6.3% |
| v | 20318 | 4.5% |
| u | 16630 | 3.7% |
| l | 15500 | 3.4% |
| t | 14641 | 3.2% |
| Other values (10) | 91028 |
Common
| Value | Count | Frequency (%) |
| (space) | 53686 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 507031 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 65388 | |
| n | 55381 | |
| (space) | 53686 | 10.6% |
| y | 49615 | |
| a | 49081 | |
| p | 47378 | |
| r | 28385 | 5.6% |
| v | 20318 | 4.0% |
| u | 16630 | 3.3% |
| l | 15500 | 3.1% |
| Other values (11) | 105669 |
payment_type
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.535458754 |
| Min length | 5 |
Characters and Unicode
| Total characters | 405605 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | per bucket |
|---|---|
| 2nd row | never pay |
| 3rd row | never pay |
| 4th row | monthly |
| 5th row | on failure |
| Value | Count | Frequency (%) |
| never | 20318 | |
| pay | 20318 | |
| per | 7223 | 9.2% |
| bucket | 7223 | 9.2% |
| monthly | 6574 | 8.4% |
| unknown | 6521 | 8.3% |
| on | 3154 | 4.0% |
| failure | 3154 | 4.0% |
| annually | 2886 | 3.7% |
| other | 844 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 59080 | |
| n | 55381 | |
| r | 31539 | 7.8% |
| (space) | 30695 | 7.6% |
| y | 29778 | 7.3% |
| a | 29244 | 7.2% |
| p | 27541 | 6.8% |
| v | 20318 | 5.0% |
| u | 19784 | 4.9% |
| o | 17093 | 4.2% |
| Other values (10) | 85152 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 374910 | |
| Space Separator | 30695 | 7.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 59080 | |
| n | 55381 | |
| r | 31539 | |
| y | 29778 | |
| a | 29244 | |
| p | 27541 | 7.3% |
| v | 20318 | 5.4% |
| u | 19784 | 5.3% |
| o | 17093 | 4.6% |
| l | 15500 | 4.1% |
| Other values (9) | 69652 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 30695 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 374910 | |
| Common | 30695 | 7.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 59080 | |
| n | 55381 | |
| r | 31539 | |
| y | 29778 | |
| a | 29244 | |
| p | 27541 | 7.3% |
| v | 20318 | 5.4% |
| u | 19784 | 5.3% |
| o | 17093 | 4.6% |
| l | 15500 | 4.1% |
| Other values (9) | 69652 |
Common
| Value | Count | Frequency (%) |
| (space) | 30695 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 405605 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 59080 | |
| n | 55381 | |
| r | 31539 | 7.8% |
| (space) | 30695 | 7.6% |
| y | 29778 | 7.3% |
| a | 29244 | 7.2% |
| p | 27541 | 6.8% |
| v | 20318 | 5.0% |
| u | 19784 | 4.9% |
| o | 17093 | 4.2% |
| Other values (10) | 85152 |
water_quality
Text
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 4 |
| Mean length | 4.301746633 |
| Min length | 4 |
Characters and Unicode
| Total characters | 204419 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | soft |
|---|---|
| 2nd row | soft |
| 3rd row | soft |
| 4th row | soft |
| 5th row | salty |
| Value | Count | Frequency (%) |
| soft | 40633 | |
| salty | 4173 | 8.7% |
| unknown | 1490 | 3.1% |
| milky | 650 | 1.4% |
| coloured | 395 | 0.8% |
| abandoned | 275 | 0.6% |
| fluoride | 179 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 44806 | |
| t | 44806 | |
| o | 43367 | |
| f | 40812 | |
| l | 5397 | 2.6% |
| n | 5020 | 2.5% |
| y | 4823 | 2.4% |
| a | 4723 | 2.3% |
| k | 2140 | 1.0% |
| u | 2064 | 1.0% |
| Other values (9) | 6461 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 204144 | |
| Space Separator | 275 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 44806 | |
| t | 44806 | |
| o | 43367 | |
| f | 40812 | |
| l | 5397 | 2.6% |
| n | 5020 | 2.5% |
| y | 4823 | 2.4% |
| a | 4723 | 2.3% |
| k | 2140 | 1.0% |
| u | 2064 | 1.0% |
| Other values (8) | 6186 | 3.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 275 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 204144 | |
| Common | 275 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 44806 | |
| t | 44806 | |
| o | 43367 | |
| f | 40812 | |
| l | 5397 | 2.6% |
| n | 5020 | 2.5% |
| y | 4823 | 2.4% |
| a | 4723 | 2.3% |
| k | 2140 | 1.0% |
| u | 2064 | 1.0% |
| Other values (8) | 6186 | 3.0% |
Common
| Value | Count | Frequency (%) |
| (space) | 275 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 204419 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 44806 | |
| t | 44806 | |
| o | 43367 | |
| f | 40812 | |
| l | 5397 | 2.6% |
| n | 5020 | 2.5% |
| y | 4823 | 2.4% |
| a | 4723 | 2.3% |
| k | 2140 | 1.0% |
| u | 2064 | 1.0% |
| Other values (9) | 6461 | 3.2% |
quality_group
Text
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 4 |
| Mean length | 4.235563973 |
| Min length | 4 |
Characters and Unicode
| Total characters | 201274 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | good |
|---|---|
| 2nd row | good |
| 3rd row | good |
| 4th row | good |
| 5th row | salty |
| Value | Count | Frequency (%) |
| good | 40633 | |
| salty | 4173 | 8.8% |
| unknown | 1490 | 3.1% |
| milky | 650 | 1.4% |
| colored | 395 | 0.8% |
| fluoride | 179 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 83725 | |
| d | 41207 | |
| g | 40633 | |
| l | 5397 | 2.7% |
| y | 4823 | 2.4% |
| n | 4470 | 2.2% |
| t | 4173 | 2.1% |
| a | 4173 | 2.1% |
| s | 4173 | 2.1% |
| k | 2140 | 1.1% |
| Other values (8) | 6360 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 201274 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 83725 | |
| d | 41207 | |
| g | 40633 | |
| l | 5397 | 2.7% |
| y | 4823 | 2.4% |
| n | 4470 | 2.2% |
| t | 4173 | 2.1% |
| a | 4173 | 2.1% |
| s | 4173 | 2.1% |
| k | 2140 | 1.1% |
| Other values (8) | 6360 | 3.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 201274 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 83725 | |
| d | 41207 | |
| g | 40633 | |
| l | 5397 | 2.7% |
| y | 4823 | 2.4% |
| n | 4470 | 2.2% |
| t | 4173 | 2.1% |
| a | 4173 | 2.1% |
| s | 4173 | 2.1% |
| k | 2140 | 1.1% |
| Other values (8) | 6360 | 3.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 201274 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 83725 | |
| d | 41207 | |
| g | 40633 | |
| l | 5397 | 2.7% |
| y | 4823 | 2.4% |
| n | 4470 | 2.2% |
| t | 4173 | 2.1% |
| a | 4173 | 2.1% |
| s | 4173 | 2.1% |
| k | 2140 | 1.1% |
| Other values (8) | 6360 | 3.2% |
quantity
Text
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.360079966 |
| Min length | 3 |
Characters and Unicode
| Total characters | 349751 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | insufficient |
|---|---|
| 2nd row | enough |
| 3rd row | enough |
| 4th row | insufficient |
| 5th row | enough |
| Value | Count | Frequency (%) |
| enough | 26538 | |
| insufficient | 12104 | |
| dry | 5024 | 10.6% |
| seasonal | 3225 | 6.8% |
| unknown | 629 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 349751 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 349751 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 349751 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
quantity_group
Text
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 7.360079966 |
| Min length | 3 |
Characters and Unicode
| Total characters | 349751 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | insufficient |
|---|---|
| 2nd row | enough |
| 3rd row | enough |
| 4th row | insufficient |
| 5th row | enough |
| Value | Count | Frequency (%) |
| enough | 26538 | |
| insufficient | 12104 | |
| dry | 5024 | 10.6% |
| seasonal | 3225 | 6.8% |
| unknown | 629 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 349751 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 349751 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 349751 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 55858 | |
| e | 41867 | |
| u | 39271 | |
| i | 36312 | |
| o | 30392 | |
| g | 26538 | |
| h | 26538 | |
| f | 24208 | |
| s | 18554 | 5.3% |
| t | 12104 | 3.5% |
| Other values (8) | 38109 |
source
Text
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 12 |
| Mean length | 8.986637205 |
| Min length | 3 |
Characters and Unicode
| Total characters | 427045 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | shallow well |
| 3rd row | shallow well |
| 4th row | river |
| 5th row | shallow well |
| Value | Count | Frequency (%) |
| shallow | 13540 | |
| well | 13540 | |
| spring | 13537 | |
| machine | 8849 | |
| dbh | 8849 | |
| river | 7719 | |
| rainwater | 1829 | 2.5% |
| harvesting | 1829 | 2.5% |
| hand | 701 | 1.0% |
| dtw | 701 | 1.0% |
| Other values (4) | 1345 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 54766 | |
| r | 34640 | 8.1% |
| e | 34550 | 8.1% |
| h | 33946 | 7.9% |
| i | 33763 | 7.9% |
| a | 29688 | 7.0% |
| w | 29666 | 6.9% |
| s | 28906 | 6.8% |
| n | 26913 | 6.3% |
| (space) | 24919 | 5.8% |
| Other values (11) | 95288 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 402126 | |
| Space Separator | 24919 | 5.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 54766 | |
| r | 34640 | |
| e | 34550 | |
| h | 33946 | |
| i | 33763 | |
| a | 29688 | 7.4% |
| w | 29666 | 7.4% |
| s | 28906 | 7.2% |
| n | 26913 | 6.7% |
| g | 15366 | 3.8% |
| Other values (10) | 79922 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 24919 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 402126 | |
| Common | 24919 | 5.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 54766 | |
| r | 34640 | |
| e | 34550 | |
| h | 33946 | |
| i | 33763 | |
| a | 29688 | 7.4% |
| w | 29666 | 7.4% |
| s | 28906 | 7.2% |
| n | 26913 | 6.7% |
| g | 15366 | 3.8% |
| Other values (10) | 79922 |
Common
| Value | Count | Frequency (%) |
| (space) | 24919 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 427045 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| l | 54766 | |
| r | 34640 | 8.1% |
| e | 34550 | 8.1% |
| h | 33946 | 7.9% |
| i | 33763 | 7.9% |
| a | 29688 | 7.0% |
| w | 29666 | 6.9% |
| s | 28906 | 6.8% |
| n | 26913 | 6.3% |
| (space) | 24919 | 5.8% |
| Other values (11) | 95288 |
source_type
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 20 |
|---|---|
| Median length | 12 |
| Mean length | 9.314330808 |
| Min length | 3 |
Characters and Unicode
| Total characters | 442617 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | spring |
|---|---|
| 2nd row | shallow well |
| 3rd row | shallow well |
| 4th row | river/lake |
| 5th row | shallow well |
| Value | Count | Frequency (%) |
| shallow | 13540 | |
| well | 13540 | |
| spring | 13537 | |
| borehole | 9550 | |
| river/lake | 8325 | |
| rainwater | 1829 | 2.9% |
| harvesting | 1829 | 2.9% |
| dam | 505 | 0.8% |
| other | 234 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 72035 | |
| e | 53182 | |
| r | 45458 | |
| o | 32874 | 7.4% |
| w | 28909 | 6.5% |
| s | 28906 | 6.5% |
| a | 27857 | 6.3% |
| i | 25520 | 5.8% |
| h | 25153 | 5.7% |
| n | 17195 | 3.9% |
| Other values (10) | 85528 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 418923 | |
| Space Separator | 15369 | 3.5% |
| Other Punctuation | 8325 | 1.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 72035 | |
| e | 53182 | |
| r | 45458 | |
| o | 32874 | |
| w | 28909 | |
| s | 28906 | |
| a | 27857 | 6.6% |
| i | 25520 | 6.1% |
| h | 25153 | 6.0% |
| n | 17195 | 4.1% |
| Other values (8) | 61834 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 15369 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 8325 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 418923 | |
| Common | 23694 | 5.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 72035 | |
| e | 53182 | |
| r | 45458 | |
| o | 32874 | |
| w | 28909 | |
| s | 28906 | |
| a | 27857 | 6.6% |
| i | 25520 | 6.1% |
| h | 25153 | 6.0% |
| n | 17195 | 4.1% |
| Other values (8) | 61834 |
Common
| Value | Count | Frequency (%) |
| (space) | 15369 | 64.9% |
| / | 8325 | 35.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 442617 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| l | 72035 | |
| e | 53182 | |
| r | 45458 | |
| o | 32874 | 7.4% |
| w | 28909 | 6.5% |
| s | 28906 | 6.5% |
| a | 27857 | 6.3% |
| i | 25520 | 5.8% |
| h | 25153 | 5.7% |
| n | 17195 | 3.9% |
| Other values (10) | 85528 |
source_class
Text
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.08308081 |
| Min length | 7 |
Characters and Unicode
| Total characters | 479148 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | groundwater |
|---|---|
| 2nd row | groundwater |
| 3rd row | groundwater |
| 4th row | surface |
| 5th row | groundwater |
| Value | Count | Frequency (%) |
| groundwater | 36627 | |
| surface | 10659 | 22.4% |
| unknown | 234 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 83913 | |
| u | 47520 | |
| a | 47286 | |
| e | 47286 | |
| n | 37329 | |
| o | 36861 | |
| w | 36861 | |
| g | 36627 | |
| d | 36627 | |
| t | 36627 | |
| Other values (4) | 32211 | 6.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 479148 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 83913 | |
| u | 47520 | |
| a | 47286 | |
| e | 47286 | |
| n | 37329 | |
| o | 36861 | |
| w | 36861 | |
| g | 36627 | |
| d | 36627 | |
| t | 36627 | |
| Other values (4) | 32211 | 6.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 479148 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 83913 | |
| u | 47520 | |
| a | 47286 | |
| e | 47286 | |
| n | 37329 | |
| o | 36861 | |
| w | 36861 | |
| g | 36627 | |
| d | 36627 | |
| t | 36627 | |
| Other values (4) | 32211 | 6.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 479148 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 83913 | |
| u | 47520 | |
| a | 47286 | |
| e | 47286 | |
| n | 37329 | |
| o | 36861 | |
| w | 36861 | |
| g | 36627 | |
| d | 36627 | |
| t | 36627 | |
| Other values (4) | 32211 | 6.7% |
waterpoint_type
Text
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 27 |
|---|---|
| Median length | 18 |
| Mean length | 14.80359848 |
| Min length | 3 |
Characters and Unicode
| Total characters | 703467 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | hand pump |
| 3rd row | other |
| 4th row | communal standpipe |
| 5th row | other |
| Value | Count | Frequency (%) |
| communal | 27615 | |
| standpipe | 27615 | |
| hand | 14073 | |
| pump | 14073 | |
| other | 5098 | 5.4% |
| multiple | 4830 | 5.1% |
| improved | 639 | 0.7% |
| spring | 639 | 0.7% |
| cattle | 91 | 0.1% |
| trough | 91 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 89484 | |
| m | 74776 | |
| n | 69942 | |
| a | 69398 | |
| (space) | 47248 | 6.7% |
| u | 46609 | 6.6% |
| d | 42331 | 6.0% |
| e | 38273 | 5.4% |
| t | 37816 | 5.4% |
| l | 37366 | 5.3% |
| Other values (8) | 150224 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 656219 | |
| Space Separator | 47248 | 6.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| p | 89484 | |
| m | 74776 | |
| n | 69942 | |
| a | 69398 | |
| u | 46609 | |
| d | 42331 | 6.5% |
| e | 38273 | 5.8% |
| t | 37816 | 5.8% |
| l | 37366 | 5.7% |
| i | 33723 | 5.1% |
| Other values (7) | 116501 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 47248 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 656219 | |
| Common | 47248 | 6.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| p | 89484 | |
| m | 74776 | |
| n | 69942 | |
| a | 69398 | |
| u | 46609 | |
| d | 42331 | 6.5% |
| e | 38273 | 5.8% |
| t | 37816 | 5.8% |
| l | 37366 | 5.7% |
| i | 33723 | 5.1% |
| Other values (7) | 116501 |
Common
| Value | Count | Frequency (%) |
| (space) | 47248 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 703467 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| p | 89484 | |
| m | 74776 | |
| n | 69942 | |
| a | 69398 | |
| (space) | 47248 | 6.7% |
| u | 46609 | 6.6% |
| d | 42331 | 6.0% |
| e | 38273 | 5.4% |
| t | 37816 | 5.4% |
| l | 37366 | 5.3% |
| Other values (8) | 150224 |
waterpoint_type_group
Text
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 13.88882576 |
| Min length | 3 |
Characters and Unicode
| Total characters | 659997 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | communal standpipe |
|---|---|
| 2nd row | hand pump |
| 3rd row | other |
| 4th row | communal standpipe |
| 5th row | other |
| Value | Count | Frequency (%) |
| communal | 27615 | |
| standpipe | 27615 | |
| hand | 14073 | |
| pump | 14073 | |
| other | 5098 | 5.7% |
| improved | 639 | 0.7% |
| spring | 639 | 0.7% |
| cattle | 91 | 0.1% |
| trough | 91 | 0.1% |
| dam | 4 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 84654 | |
| m | 69946 | |
| n | 69942 | |
| a | 69398 | |
| (space) | 42418 | 6.4% |
| d | 42331 | 6.4% |
| u | 41779 | 6.3% |
| e | 33443 | 5.1% |
| o | 33443 | 5.1% |
| t | 32986 | 5.0% |
| Other values (8) | 139657 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 617579 | |
| Space Separator | 42418 | 6.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| p | 84654 | |
| m | 69946 | |
| n | 69942 | |
| a | 69398 | |
| d | 42331 | 6.9% |
| u | 41779 | 6.8% |
| e | 33443 | 5.4% |
| o | 33443 | 5.4% |
| t | 32986 | 5.3% |
| i | 28893 | 4.7% |
| Other values (7) | 110764 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 42418 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 617579 | |
| Common | 42418 | 6.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| p | 84654 | |
| m | 69946 | |
| n | 69942 | |
| a | 69398 | |
| d | 42331 | 6.9% |
| u | 41779 | 6.8% |
| e | 33443 | 5.4% |
| o | 33443 | 5.4% |
| t | 32986 | 5.3% |
| i | 28893 | 4.7% |
| Other values (7) | 110764 |
Common
| Value | Count | Frequency (%) |
| (space) | 42418 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 659997 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| p | 84654 | |
| m | 69946 | |
| n | 69942 | |
| a | 69398 | |
| (space) | 42418 | 6.4% |
| d | 42331 | 6.4% |
| u | 41779 | 6.3% |
| e | 33443 | 5.1% |
| o | 33443 | 5.1% |
| t | 32986 | 5.0% |
| Other values (8) | 139657 |
status_group
Text
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 371.4 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 10 |
| Mean length | 12.48455387 |
| Min length | 10 |
Characters and Unicode
| Total characters | 593266 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | functional |
|---|---|
| 2nd row | functional |
| 3rd row | non functional |
| 4th row | non functional |
| 5th row | non functional |
| Value | Count | Frequency (%) |
| functional | 47520 | |
| non | 18252 | 25.1% |
| needs | 3466 | 4.8% |
| repair | 3466 | 4.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 135010 | |
| o | 65772 | |
| i | 50986 | 8.6% |
| a | 50986 | 8.6% |
| f | 47520 | 8.0% |
| u | 47520 | 8.0% |
| c | 47520 | 8.0% |
| t | 47520 | 8.0% |
| l | 47520 | 8.0% |
| (space) | 25184 | 4.2% |
| Other values (5) | 27728 | 4.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 568082 | |
| Space Separator | 25184 | 4.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 135010 | |
| o | 65772 | |
| i | 50986 | 9.0% |
| a | 50986 | 9.0% |
| f | 47520 | 8.4% |
| u | 47520 | 8.4% |
| c | 47520 | 8.4% |
| t | 47520 | 8.4% |
| l | 47520 | 8.4% |
| e | 10398 | 1.8% |
| Other values (4) | 17330 | 3.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 25184 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 568082 | |
| Common | 25184 | 4.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 135010 | |
| o | 65772 | |
| i | 50986 | 9.0% |
| a | 50986 | 9.0% |
| f | 47520 | 8.4% |
| u | 47520 | 8.4% |
| c | 47520 | 8.4% |
| t | 47520 | 8.4% |
| l | 47520 | 8.4% |
| e | 10398 | 1.8% |
| Other values (4) | 17330 | 3.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 25184 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 593266 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 135010 | |
| o | 65772 | |
| i | 50986 | 8.6% |
| a | 50986 | 8.6% |
| f | 47520 | 8.0% |
| u | 47520 | 8.0% |
| c | 47520 | 8.0% |
| t | 47520 | 8.0% |
| l | 47520 | 8.0% |
| (space) | 25184 | 4.2% |
| Other values (5) | 27728 | 4.7% |